Recognize your users' handwriting

The Handwriting Recognition API allows you to recognize text from handwritten input as it happens.

What is the Handwriting Recognition API?

The Handwriting Recognition API allows you to convert handwriting (ink) from your users into text. Some operating systems have long included such APIs, and with this new capability, your web apps can finally use this functionality. The conversion takes place directly on the user's device and works even in offline mode, all without adding any third-party libraries or services.

This API implements so-called "on-line" or near real-time recognition. This means that the handwritten input is recognized while the user is drawing it, by capturing and analyzing the individual strokes. In contrast to "off-line" procedures such as Optical Character Recognition (OCR), where only the end product is known, on-line algorithms can provide a higher level of accuracy due to additional signals like the temporal sequence and pressure of individual ink strokes.

Suggested use cases for the Handwriting Recognition API

Example uses include:

  • Note-taking applications where users want to capture handwritten notes and have them translated into text.
  • Forms applications where users can use pen or finger input due to time constraints.
  • Games that require filling in letters or numbers, such as crosswords, hangman, or sudoku.

Current status

The Handwriting Recognition API is available from Chromium 99.

How to use the Handwriting Recognition API

Feature detection

Detect browser support by checking for the existence of the createHandwritingRecognizer() method on the navigator object:

if ('createHandwritingRecognizer' in navigator) {
  // 🎉 The Handwriting Recognition API is supported!
}

Core concepts

The Handwriting Recognition API converts handwritten input into text, regardless of the input method (mouse, touch, pen). The API has four main entities:

  1. A point represents where the pointer was at a particular time.
  2. A stroke consists of one or more points. The recording of a stroke starts when the user puts the pointer down (i.e., clicks the primary mouse button, or touches the screen with their pen or finger) and ends when they raise the pointer back up.
  3. A drawing consists of one or more strokes. The actual recognition takes place at this level.
  4. The recognizer is configured with the expected input language. It is used to create an instance of a drawing with the recognizer configuration applied.

These concepts are implemented as specific interfaces and dictionaries, which I'll cover shortly.

The core entities of the Handwriting Recognition API: one or more points compose a stroke, and one or more strokes compose a drawing, which the recognizer creates. The actual recognition takes place at the drawing level.
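To make the hierarchy concrete, here is a minimal sketch that models points, strokes, and a drawing as plain JavaScript objects. It does not use the API's own interfaces, and the names makePoint, countPoints, and sampleDrawing are hypothetical and exist only for illustration.

```javascript
// Plain-object model of the hierarchy (illustrative only, not the API's interfaces).

// A point: where the pointer was (x, y) and when (t, in milliseconds).
function makePoint(x, y, t) {
  return { x, y, t };
}

// A stroke: the points recorded between pointer-down and pointer-up.
const verticalStroke = [makePoint(10, 10, 0), makePoint(10, 40, 16), makePoint(10, 70, 32)];
const horizontalStroke = [makePoint(0, 10, 0), makePoint(20, 10, 16)];

// A drawing: one or more strokes. These two strokes roughly form the letter "T".
const sampleDrawing = [verticalStroke, horizontalStroke];

// Counts all points across all strokes of a drawing.
function countPoints(drawing) {
  return drawing.reduce((total, stroke) => total + stroke.length, 0);
}
```

In the real API, the browser provides the stroke and drawing objects, as the following sections show.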

Creating a recognizer

To recognize text from handwritten input, you need to obtain an instance of a HandwritingRecognizer by calling navigator.createHandwritingRecognizer() and passing constraints to it. Constraints determine the handwriting recognition model that should be used. Currently, you can specify a list of languages in order of preference:

const recognizer = await navigator.createHandwritingRecognizer({
  languages: ['en'],
});

The method returns a promise resolving with an instance of a HandwritingRecognizer when the browser can fulfill your request. Otherwise, it will reject the promise with an error, and handwriting recognition will not be available. For this reason, you may want to query the recognizer's support for particular recognition features first.
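Because the promise can reject, it may help to wrap the call so the rest of the application only has to deal with "a recognizer" or "no recognizer". The following is a minimal sketch under that assumption. The helper name createRecognizerOrNull is hypothetical, and the nav parameter stands in for the browser's navigator object so the function does not depend on globals.

```javascript
// Hypothetical helper: resolves with a recognizer, or with null if none is available.
// `nav` is expected to be the browser's `navigator` object.
async function createRecognizerOrNull(nav, languages) {
  if (!nav || !('createHandwritingRecognizer' in nav)) {
    return null; // The API is not supported at all.
  }
  try {
    return await nav.createHandwritingRecognizer({ languages });
  } catch (error) {
    // The browser rejected the request, so recognition is not available.
    console.warn('Handwriting recognition unavailable:', error);
    return null;
  }
}

// Possible usage in a browser, inside an async function or module:
// const recognizer = await createRecognizerOrNull(navigator, ['en']);
// if (!recognizer) { /* fall back to keyboard input */ }
```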

Querying recognizer support

By calling navigator.queryHandwritingRecognizerSupport(), you can check if the target platform supports the handwriting recognition features you intend to use. In the following example, the developer:

  • wants to detect text in English
  • wants alternative, less likely predictions when available
  • wants access to the segmentation result, i.e., the recognized characters, including the points and strokes that make them up

const { languages, alternatives, segmentationResult } =
  await navigator.queryHandwritingRecognizerSupport({
    languages: ['en'],
    alternatives: true,
    segmentationResult: true,
  });

console.log(languages); // true or false
console.log(alternatives); // true or false
console.log(segmentationResult); // true or false

The method returns a promise resolving with a result object. If the browser supports the feature specified by the developer, its value will be set to true. Otherwise, it will be set to false. You can use this information to enable or disable certain features within your application, or to adjust your query and send a new one.
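As one way to act on that result, the sketch below maps the result object to feature flags an application could use to show or hide parts of its UI. The function name toFeatureFlags and the flag names are hypothetical. The input is assumed to have the shape described above, with one boolean per queried feature.

```javascript
// Hypothetical helper: derives UI feature flags from the support query result.
// `support` is assumed to look like { languages, alternatives, segmentationResult }.
function toFeatureFlags(support) {
  // Without language support there is nothing to recognize, so everything is off.
  const canRecognize = support.languages === true;
  return {
    canRecognize,
    showAlternatives: canRecognize && support.alternatives === true,
    highlightGraphemes: canRecognize && support.segmentationResult === true,
  };
}

// Possible usage:
// const flags = toFeatureFlags(await navigator.queryHandwritingRecognizerSupport({ ... }));
// if (!flags.canRecognize) { /* hide the handwriting button */ }
```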

Start a drawing

Within your application, you should offer an input area where the user makes their handwritten entries. For performance reasons, it is recommended to implement this with the help of a canvas object. The exact implementation of this part is out of scope for this article, but you may refer to the demo to see how it can be done.

To start a new drawing, call the startDrawing() method on the recognizer. This method takes an object containing different hints to fine-tune the recognition algorithm. All hints are optional:

  • The kind of text being entered: text, email addresses, numbers, or an individual character (recognitionType)
  • The type of input device: mouse, touch, or pen input (inputType)
  • The preceding text (textContext)
  • The number of less-likely alternative predictions that should be returned (alternatives)
  • A list of user-identifiable characters ("graphemes") the user will most likely enter (graphemeSet)

The Handwriting Recognition API plays well with Pointer Events, which provide an abstract interface to consume input from any pointing device. The pointer event arguments contain the type of pointer being used. This means you can use pointer events to determine the input type automatically. In the following example, the drawing for handwriting recognition is automatically created on the first occurrence of a pointerdown event on the handwriting area. As the pointerType may be empty or set to a proprietary value, I introduced a consistency check to make sure only supported values are set for the drawing's input type.

let drawing;
let activeStroke;

canvas.addEventListener('pointerdown', (event) => {
  if (!drawing) {
    drawing = recognizer.startDrawing({
      recognitionType: 'text', // email, number, per-character
      inputType: ['mouse', 'touch', 'pen'].find((type) => type === event.pointerType),
      textContext: 'Hello, ',
      alternatives: 2,
      graphemeSet: ['f', 'i', 'z', 'b', 'u'], // for a fizz buzz entry form
    });
  }
  startStroke(event);
});
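The consistency check can also be pulled out into a small function, which keeps the event handler short and makes the check easy to test. The sketch below is one possible way to do that. The names SUPPORTED_INPUT_TYPES and buildHints are hypothetical. The function only sets inputType when the pointer type is one of the three values named above and otherwise leaves the hint out.

```javascript
// The input types listed in this article.
const SUPPORTED_INPUT_TYPES = ['mouse', 'touch', 'pen'];

// Hypothetical helper: copies the base hints and adds `inputType` only when
// the pointer type is a supported value. Empty or proprietary values are skipped.
function buildHints(pointerType, baseHints = {}) {
  const hints = { ...baseHints };
  if (SUPPORTED_INPUT_TYPES.includes(pointerType)) {
    hints.inputType = pointerType;
  }
  return hints;
}

// Possible usage inside the pointerdown handler:
// drawing = recognizer.startDrawing(buildHints(event.pointerType, { recognitionType: 'text' }));
```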

Add a stroke

The pointerdown event is also the right place to start a new stroke. To do so, create a new instance of HandwritingStroke. Also, you should store the current time as a point of reference for the subsequent points added to it:

function startStroke(event) {
  activeStroke = {
    stroke: new HandwritingStroke(),
    startTime: Date.now(),
  };
  addPoint(event);
}

Add a point

After creating the stroke, you should directly add the first point to it. As you will add more points later on, it makes sense to implement the point creation logic in a separate method. In the following example, the addPoint() method calculates the elapsed time from the reference timestamp. The temporal information is optional, but can improve recognition quality. Then, it reads the X and Y coordinates from the pointer event and adds the point to the current stroke.

function addPoint(event) {
  const timeElapsed = Date.now() - activeStroke.startTime;
  activeStroke.stroke.addPoint({
    x: event.offsetX,
    y: event.offsetY,
    t: timeElapsed,
  });
}

The pointermove event handler is called when the pointer is moved across the screen. Those points need to be added to the stroke as well. The event can also be raised if the pointer is not in a "down" state, for example when moving the cursor across the screen without pressing the mouse button. The event handler from the following example checks if an active stroke exists, and adds the new point to it.

canvas.addEventListener('pointermove', (event) => {
  if (activeStroke) {
    addPoint(event);
  }
});
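The handlers in this section share state through the activeStroke variable. As an alternative, the state can live in a small tracker object. The sketch below is a hypothetical design, not part of the API. The stroke factory and the clock are passed in as parameters, so in a browser you could pass () => new HandwritingStroke(). It also includes a cancel() method, on the assumption that you want to discard a stroke when the browser fires a pointercancel event. Ending a stroke is covered in the next section.

```javascript
// Hypothetical stroke tracker. `createStroke` returns an object with an
// addPoint() method; `now` returns the current time in milliseconds.
function createStrokeTracker(createStroke, now = () => Date.now()) {
  let active = null;

  const addPoint = (x, y) => {
    // t is the time elapsed since the stroke started.
    active.stroke.addPoint({ x, y, t: now() - active.startTime });
  };

  return {
    // Call on pointerdown: starts a stroke and records its first point.
    down(x, y) {
      active = { stroke: createStroke(), startTime: now() };
      addPoint(x, y);
    },
    // Call on pointermove: ignored while no stroke is in progress.
    move(x, y) {
      if (active) addPoint(x, y);
    },
    // Call on pointerup: returns the finished stroke, or null if there was none.
    up() {
      const finished = active ? active.stroke : null;
      active = null;
      return finished;
    },
    // Call on pointercancel: discards the stroke in progress.
    cancel() {
      active = null;
    },
  };
}
```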

Recognize text

When the user lifts the pointer again, you can add the stroke to your drawing by calling its addStroke() method. The following example also resets the activeStroke, so the pointermove handler will not add points to the completed stroke.

Next, it's time for recognizing the user's input by calling the getPrediction() method on the drawing. Recognition usually takes less than a few hundred milliseconds, so you can repeatedly run predictions if needed. The following example runs a new prediction after each completed stroke.

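canvas.addEventListener('pointerup', async (event) => {
  if (!activeStroke) {
    return; // No stroke in progress, for example when the pointer went down outside the canvas.
  }
  drawing.addStroke(activeStroke.stroke);
  activeStroke = null;

  const [mostLikelyPrediction, ...lessLikelyAlternatives] = await drawing.getPrediction();
  if (mostLikelyPrediction) {
    console.log(mostLikelyPrediction.text);
  }
  // The rest element is always an array, so optional chaining is not needed.
  lessLikelyAlternatives.forEach((alternative) => console.log(alternative.text));
});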

This method returns a promise which resolves with an array of predictions ordered by their likelihood. The number of elements depends on the value you passed to the alternatives hint. You could use this array to present the user with a choice of possible matches, and have them select an option. Alternatively, you can simply go with the most likely prediction, which is what I do in the example.

The prediction object contains the recognized text and an optional segmentation result, which I will discuss in the following section.
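If you do want to offer the user a choice, the predictions array first has to be turned into something a UI can render. The following sketch shows one way to do that. The function name toChoices and the shape of its return value are hypothetical. It relies only on the text property described above.

```javascript
// Hypothetical helper: splits the predictions into the best match and the
// remaining alternatives, keeping only the recognized text of each.
function toChoices(predictions) {
  const [best, ...others] = predictions;
  return {
    best: best ? best.text : '', // Empty string when nothing was recognized.
    others: others.map((prediction) => prediction.text),
  };
}

// Possible usage:
// const { best, others } = toChoices(await drawing.getPrediction());
// Show `best` in the text field and offer `others` as suggestions.
```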

Detailed insights with segmentation results

If supported by the target platform, the prediction object can also contain a segmentation result. This is an array containing all recognized handwriting segments. Each segment combines the recognized user-identifiable character (grapheme) with its position in the recognized text (beginIndex, endIndex) and the strokes and points that created it.

if (mostLikelyPrediction.segmentationResult) {
  mostLikelyPrediction.segmentationResult.forEach(
    ({ grapheme, beginIndex, endIndex, drawingSegments }) => {
      console.log(grapheme, beginIndex, endIndex);
      drawingSegments.forEach(({ strokeIndex, beginPointIndex, endPointIndex }) => {
        console.log(strokeIndex, beginPointIndex, endPointIndex);
      });
    },
  );
}

You could use this information to track down the recognized graphemes on the canvas again.

Boxes are drawn around each recognized grapheme
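One way to compute such boxes is sketched below. It rests on two assumptions. First, the application keeps its own copy of the recorded points in a hypothetical strokes array (one array of { x, y } objects per stroke, in the order the strokes were added to the drawing). Second, endPointIndex is treated as exclusive, which you should verify against the specification. The function name graphemeBoundingBoxes is hypothetical as well.

```javascript
// Hypothetical helper: computes one bounding box per recognized grapheme.
// Assumes every segment covers at least one point, and that endPointIndex
// is exclusive (an assumption of this sketch).
function graphemeBoundingBoxes(segmentationResult, strokes) {
  return segmentationResult.map(({ grapheme, drawingSegments }) => {
    // Collect all points that belong to this grapheme, across strokes.
    const points = drawingSegments.flatMap(
      ({ strokeIndex, beginPointIndex, endPointIndex }) =>
        strokes[strokeIndex].slice(beginPointIndex, endPointIndex),
    );
    const xs = points.map((point) => point.x);
    const ys = points.map((point) => point.y);
    const left = Math.min(...xs);
    const top = Math.min(...ys);
    return {
      grapheme,
      x: left,
      y: top,
      width: Math.max(...xs) - left,
      height: Math.max(...ys) - top,
    };
  });
}
```

Each box could then be drawn with the canvas 2D context's strokeRect(x, y, width, height) method.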

Complete recognition

After the recognition has completed, you can free resources by calling the clear() method on the HandwritingDrawing, and the finish() method on the HandwritingRecognizer:

drawing.clear();
recognizer.finish();
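To make sure finish() is called even when something goes wrong in between, the recognizer's lifetime can be tied to a try...finally block. The following is a minimal sketch of that idea. The helper name withRecognizer and its task callback are hypothetical, and nav again stands in for the browser's navigator object.

```javascript
// Hypothetical helper: creates a recognizer, runs `task` with it, and always
// calls finish() afterwards, whether `task` succeeds or throws.
async function withRecognizer(nav, constraints, task) {
  const recognizer = await nav.createHandwritingRecognizer(constraints);
  try {
    return await task(recognizer);
  } finally {
    recognizer.finish(); // Free resources in every case.
  }
}

// Possible usage in a browser:
// const text = await withRecognizer(navigator, { languages: ['en'] }, async (recognizer) => {
//   const drawing = recognizer.startDrawing({ recognitionType: 'text' });
//   /* ...add strokes... */
//   const [best] = await drawing.getPrediction();
//   drawing.clear();
//   return best ? best.text : '';
// });
```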

Demo

The web component <handwriting-textarea> implements a progressively enhanced editing control capable of handwriting recognition. By clicking the button in the lower right corner of the editing control, you activate the drawing mode. When you complete the drawing, the web component will automatically start the recognition and add the recognized text back to the editing control. If the Handwriting Recognition API is not supported at all, or the platform doesn't support the requested features, the edit button will be hidden, but the basic editing control remains usable as a <textarea>.

The web component offers properties and attributes to define the recognition behavior from the outside, including languages and recognitiontype. You can set the content of the control via the value attribute:

<handwriting-textarea languages="en" recognitiontype="text" value="Hello"></handwriting-textarea>

To be informed about any changes to the value, you can listen to the input event.

You can try the component using this demo on Glitch. Also be sure to have a look at the source code. To use the control in your application, obtain it from npm.

Security and permissions

The Chromium team designed and implemented the Handwriting Recognition API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics.

User control

The Handwriting Recognition API can't be turned off by the user. It is only available for websites delivered via HTTPS, and may only be called from the top-level browsing context.

Transparency

There is no indication if handwriting recognition is active. To prevent fingerprinting, the browser implements countermeasures, such as displaying a permission prompt to the user when it detects possible abuse.

Permission persistence

The Handwriting Recognition API currently does not show any permission prompts. Thus, permission does not need to be persisted in any way.

Feedback

The Chromium team wants to hear about your experiences with the Handwriting Recognition API.

Tell us about the API design

Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue.

Report a problem with the implementation

Did you find a bug with Chromium's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>Handwriting in the Components box. Glitch works great for sharing quick and easy repros.

Show support for the API

Are you planning to use the Handwriting Recognition API? Your public support helps the Chromium team prioritize features and shows other browser vendors how critical it is to support them.

Share how you plan to use it on the WICG Discourse thread. Send a tweet to @ChromiumDev using the hashtag #HandwritingRecognition and let us know where and how you're using it.

Acknowledgements

This article was reviewed by Joe Medley, Honglin Yu and Jiewei Qian. Hero image by Samir Bouaked on Unsplash.