Published: September 24, 2024, Last updated: December 10, 2024
Before translating text from one language to another, you must first determine what language is used in the given text. Previously, translation required uploading the text to a cloud service, performing the translation on the server, then downloading the results.
The Language Detector API runs inference on-device, so the text doesn't have to leave the user's device, which can improve your privacy story. While it's possible to ship a dedicated library that does this, it would require users to download additional resources.
Availability
- Join the Language Detector API origin trial, running in Chrome 132 to 135, to test the API with real users in production. Origin trials enable the feature for all users on your origin on Chrome.
- Follow our implementation in Chrome Status.
- The Language Detector and Translator API proposal is open to discussion.
- Join the early preview program for an early look at new built-in AI APIs and access to discussion on our mailing list.
Sign up for the origin trial
To start using the Language Detector API, follow these steps:
- Acknowledge Google's Generative AI Prohibited Uses Policy.
- Go to the Language Detector API origin trial.
- Click Register and fill out the form.
- In the Web origin field, provide your origin or extension ID, chrome-extension://YOUR_EXTENSION_ID.
- To submit, click Register.
- Copy the token provided, and add it to every web page on your origin or file for your Extension on which you want the trial to be enabled.
- If you're building an Extension, follow the Extensions origin trial instructions.
- Start using the Language Detector API.
Learn more about how to get started with origin trials.
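Origin trial tokens are commonly provided to a page through a meta tag with http-equiv="origin-trial" (or through an Origin-Trial HTTP header). As a minimal sketch of adding the token from script, assuming a hypothetical helper name addOriginTrialToken and a placeholder token string, you might write something like this:

```javascript
// Hypothetical helper (not part of any Chrome API): appends an origin trial
// meta tag to the given document's <head> and returns the created element.
function addOriginTrialToken(token, doc = document) {
  const meta = doc.createElement('meta');
  meta.httpEquiv = 'origin-trial';
  meta.content = token;
  doc.head.append(meta);
  return meta;
}

// Usage in a page (placeholder token, replace with the one you copied):
// addOriginTrialToken('YOUR_ORIGIN_TRIAL_TOKEN');
```

Adding the tag directly to your HTML works equally well; script injection is mainly useful when the markup is generated elsewhere.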
Add support to localhost
To access the Language Detector API on localhost during the origin trial, you must update Chrome to the latest version. Then, follow these steps:
- Go to chrome://flags/#optimization-guide-on-device-model.
- Select Enabled BypassPerfRequirement. This skips the performance checks and VRAM requirements that might otherwise prevent Gemini Nano from downloading on your device.
- Go to chrome://flags/#language-detection-api.
- Select Enabled.
- Click Relaunch or restart Chrome.
Example use cases
The Language Detector API is primarily useful in the following scenarios:
- Determine the language of input text, so it can be translated.
- Determine the language of input text, so the correct model can be loaded for language-specific tasks, such as toxicity detection.
- Determine the language of input text, so it can be labeled correctly, for example, in online social networking sites.
- Determine the language of input text, so an app's interface can be adjusted accordingly. For example, a Belgian site could show only the interface relevant to users who speak French.
Use the Language Detector API
The Language Detector API is part of the larger family of the Translator API. First, run feature detection to see if the browser supports the Language Detector API.
if ('ai' in self && 'languageDetector' in self.ai) {
  // The Language Detector API is available.
}
Model download
Language detection depends on a model that is fine-tuned for the specific task of detecting languages. While the API is built in the browser, the model is downloaded on-demand the first time a site tries to use the API. In Chrome, this model is very small by comparison with other models. In fact, it might already be present given that this model is also used by Chrome browser features.
To see if the model is ready to use, call the asynchronous self.ai.languageDetector.capabilities() function and inspect the available field.
There are three possible responses:
- 'no': The current browser supports the Language Detector API, but it can't be used at the moment. For example, because there isn't enough free disk space available to download the model.
- 'readily': The current browser supports the Language Detector API, and it can be used right away.
- 'after-download': The current browser supports the Language Detector API, but it needs to download the model first.
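To make the three availability values concrete, here is a minimal sketch that maps each one to a decision your app could act on. The helper name describeAvailability and the returned object shape are assumptions made for illustration; only the three string values come from the API as described here.

```javascript
// Hypothetical helper: turns the `available` value into a small status object.
// Any value other than the three documented ones is treated as not usable.
function describeAvailability(available) {
  switch (available) {
    case 'readily':
      return { usable: true, needsDownload: false };
    case 'after-download':
      return { usable: true, needsDownload: true };
    case 'no':
    default:
      return { usable: false, needsDownload: false };
  }
}

// Usage (in a browser that exposes the API):
// const { available } = await self.ai.languageDetector.capabilities();
// const status = describeAvailability(available);
```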
To trigger the download and instantiate the language detector, call the asynchronous self.ai.languageDetector.create() function. If the response to capabilities() was 'after-download', it's best practice to listen for download progress, so you can inform the user in case the download takes time.
To see if a given language can be detected, call the languageAvailable() function.
const languageDetectorCapabilities = await self.ai.languageDetector.capabilities();
languageDetectorCapabilities.languageAvailable('es');
// 'readily'
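If your app cares about several languages, you could check them all in one pass. The following is a sketch only: groupByAvailability is a hypothetical helper name, and it groups language tags by whatever string languageAvailable() returns without assuming which values are possible.

```javascript
// Hypothetical helper: groups language tags by the status string that
// capabilities.languageAvailable(tag) returns for each of them.
function groupByAvailability(capabilities, languageTags) {
  const groups = {};
  for (const tag of languageTags) {
    const status = capabilities.languageAvailable(tag);
    if (!groups[status]) {
      groups[status] = [];
    }
    groups[status].push(tag);
  }
  return groups;
}

// Usage (in a browser that exposes the API):
// const capabilities = await self.ai.languageDetector.capabilities();
// const groups = groupByAvailability(capabilities, ['es', 'fr', 'de']);
```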
The following example demonstrates how to initialize the language detector.
const languageDetectorCapabilities = await self.ai.languageDetector.capabilities();
// The `available` field is 'no', 'readily', or 'after-download'.
const canDetect = languageDetectorCapabilities.available;
let detector;
if (canDetect === 'no') {
  // The language detector isn't usable.
  return;
}
if (canDetect === 'readily') {
  // The language detector can immediately be used.
  detector = await self.ai.languageDetector.create();
} else {
  // The language detector can be used after model download.
  detector = await self.ai.languageDetector.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (e) => {
        console.log(`Downloaded ${e.loaded} of ${e.total} bytes.`);
      });
    },
  });
  await detector.ready;
}
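Instead of logging raw byte counts, you may want to show the user a percentage. Here is a minimal sketch, assuming the downloadprogress event exposes loaded and total as in the example above; the helper name formatDownloadProgress and the message wording are made up for illustration.

```javascript
// Hypothetical helper: converts a downloadprogress-like event into a
// user-facing message. Falls back to a generic message when the total
// size is unknown or invalid.
function formatDownloadProgress({ loaded, total }) {
  if (!Number.isFinite(loaded) || !Number.isFinite(total) || total <= 0) {
    return 'Downloading language detection model...';
  }
  const percent = Math.min(100, Math.round((loaded / total) * 100));
  return `Downloading language detection model: ${percent}%`;
}

// Usage inside the monitor callback (statusElement is a hypothetical element):
// m.addEventListener('downloadprogress', (e) => {
//   statusElement.textContent = formatDownloadProgress(e);
// });
```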
Run the language detector
The Language Detector API uses a ranking model to determine which language is most likely used in a given piece of text. Ranking is a type of machine learning, where the objective is to order a list of items. In this case, the Language Detector API ranks languages from highest to lowest probability.
The detect() function returns a list of {detectedLanguage, confidence} objects, ranked from most likely to least likely. You can use only the first result, the likeliest answer, or iterate over the ranked candidates with their level of confidence. The confidence level is expressed as a value between 0.0 (lowest confidence) and 1.0 (highest confidence).
const someUserText = 'Hallo und herzlich willkommen!';
const results = await detector.detect(someUserText);
for (const result of results) {
  // Show the full list of potential languages with their likelihood, ranked
  // from most likely to least likely. In practice, one would pick the top
  // language(s) that cross a high enough threshold.
  console.log(result.detectedLanguage, result.confidence);
}
// (Output truncated):
// de 0.9993835687637329
// en 0.00038279531872831285
// nl 0.00010798392031574622
// ...
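As the comment in the example suggests, you would usually keep only the languages that cross a confidence threshold. The following sketch shows one way to do that; pickLikelyLanguages is a hypothetical helper, and the default threshold of 0.5 is an arbitrary assumption that you should tune for your own use case.

```javascript
// Hypothetical helper: returns the language tags whose confidence is at
// least `minConfidence`, ordered from most to least likely.
// The 0.5 default is an arbitrary choice, not a recommended value.
function pickLikelyLanguages(results, minConfidence = 0.5) {
  return results
    .filter((result) => result.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .map((result) => result.detectedLanguage);
}

// Usage (in a browser that exposes the API):
// const results = await detector.detect(someUserText);
// const [topLanguage] = pickLikelyLanguages(results);
```

If no candidate crosses the threshold, the returned list is empty, which your app can treat as "language unknown".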
Demo
Preview the Language Detector API in our demo. Enter text written in different languages in the textarea.
Standardization effort
The Language Detector API was moved to the W3C Web Incubator Community Group after the corresponding proposal received enough support. The API is part of a larger Translation API proposal.
The Chrome team requested feedback from the W3C Technical Architecture Group and asked Mozilla and WebKit for their standards positions.
Share your feedback
If you have feedback on Chrome's implementation, file a Chromium bug. Share your feedback on the API shape of the Language Detector API by commenting on an existing Issue or opening a new one in the Translation API GitHub repository.