Google announced last week that it has improved handwriting recognition in Gboard, its popular keyboard for mobile devices. The updated recognizer is faster and makes 20%-40% fewer mistakes than before.

Last year, Google added handwriting recognition to Gboard for Android, with support for more than 100 languages. Since then, advances in machine learning have allowed Google to develop new model architectures and training methodologies.

Google replaced its initial approach, which relied on hand-designed heuristics, with a single machine learning model. This model operates on the whole input and reduces error rates significantly compared to the old version.

Google also published a paper titled “Fast Multi-language LSTM-based Online Handwriting Recognition” explaining its research regarding online handwriting recognition.

The Google team states that since Gboard is used on a range of devices and screen resolutions, the first step is to normalize the touch-point coordinates. The team then converts the sequence of points into a sequence of cubic Bézier curves, which serve as inputs to a recurrent neural network (RNN). This RNN is trained to accurately identify the character being written. Bézier curves provide a consistent representation of the input across devices with different sampling rates and accuracies.
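To make these two preprocessing ideas concrete, here is a minimal Python sketch. It is not Gboard's implementation: the function names are hypothetical, and scaling by the stroke's bounding-box height is only one simple normalization assumed here for illustration, since the article does not say which one Google uses. The second function evaluates a standard cubic Bézier curve from its four control points.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def normalize_points(points: List[Point]) -> List[Point]:
    """Shift a stroke to the origin and scale it so that its height is 1.

    Assumption: dividing by the bounding-box height is one simple way to
    remove the dependence on screen resolution. The exact normalization
    used in Gboard is not described in the article.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    height = max(ys) - min_y
    scale = height if height > 0 else 1.0  # avoid dividing by zero on flat strokes
    return [((x - min_x) / scale, (y - min_y) / scale) for x, y in points]


def cubic_bezier(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate a cubic Bezier curve with control points p0..p3 at t in [0, 1]."""
    u = 1.0 - t
    # Bernstein basis weights for a cubic curve.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


# The same stroke captured at two screen resolutions normalizes identically.
low_res = [(100.0, 200.0), (150.0, 400.0), (300.0, 300.0)]
high_res = [(2 * x, 2 * y) for x, y in low_res]
same_after_normalizing = normalize_points(low_res) == normalize_points(high_res)
```

Because a cubic Bézier curve is fully described by four control points, a long run of sampled touch points can be summarized by a handful of numbers, whatever the device's sampling rate.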

Another benefit is that the sequence of Bézier curves is far more compact than the underlying sequence of input points. This makes it easier for the model to pick up temporal dependencies along the input.

Although the sequence of curves represents the input, the researchers still need to translate it into the actual written characters. To do so, a multi-layer RNN processes the sequence of curves and produces an output decoding matrix.

The researchers settled on a bidirectional version of quasi-recurrent neural networks (QRNNs). QRNNs alternate between convolutional and recurrent layers and offer good predictive performance. To “decode” the curves, the RNN produces a matrix in which each column corresponds to one input curve and each row corresponds to a letter in the alphabet.
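As an illustration of how such a matrix could be turned into text, the Python sketch below picks the highest-scoring row in every column, merges consecutive repeats, and drops a "blank" symbol. This greedy, CTC-style rule is an assumption made for the example, as are the `BLANK` symbol, the function name, and the toy scores. The article does not describe the decoding procedure Google actually uses.

```python
from typing import List, Sequence

BLANK = "_"  # hypothetical "no character" symbol (CTC-style assumption)


def greedy_decode(matrix: Sequence[Sequence[float]], alphabet: Sequence[str]) -> str:
    """Decode a score matrix where matrix[r][c] scores alphabet[r] for curve c.

    For each column (input curve) take the best-scoring row (letter), then
    collapse consecutive repeats and remove blanks.
    """
    if not matrix:
        return ""
    n_curves = len(matrix[0])
    decoded: List[str] = []
    previous = None
    for c in range(n_curves):
        column = [row[c] for row in matrix]
        best = alphabet[column.index(max(column))]
        if best != previous and best != BLANK:
            decoded.append(best)
        previous = best
    return "".join(decoded)


# Toy example: three rows (blank, "g", "o") and five columns (input curves).
alphabet = [BLANK, "g", "o"]
scores = [
    [0.1, 0.2, 0.1, 0.8, 0.1],  # "_"
    [0.8, 0.7, 0.1, 0.1, 0.1],  # "g"
    [0.1, 0.1, 0.8, 0.1, 0.8],  # "o"
]
word = greedy_decode(scores, alphabet)
```

In this toy input the per-column winners are g, g, o, blank, o. The repeated "g" is merged, and the blank separates the two "o" columns so that both are kept.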

The QRNN-based recognizer converts the sequence of curves into a sequence of character probabilities of the same length. Accurate recognition models alone are not enough to offer the best user experience, however; they also need to be fast. This is why the researchers converted their recognition models (trained in TensorFlow) to TensorFlow Lite models.

“We will continue to push the envelope beyond improving the Latin-script language recognizers. The Handwriting Team is already hard at work launching new models for all our supported handwriting languages in Gboard”, states the Google team.

For more information, check out the official Google AI blog.

Tech writer at the Packt Hub. Dreamer, book nerd, lover of scented candles, karaoke, and Gilmore Girls.