AutoAugment: Google’s research initiative to improve deep learning performance

Deep learning and artificial intelligence apply cognitive abilities to build specialized solutions to a range of problems. With growing innovation, the field of artificial intelligence is expanding rapidly. Deep learning has already shown its mettle in handling data in many forms, such as text, images, video, audio, and social interactions.

Many vendors, such as Google, Microsoft, Amazon, and IBM, are constantly working to bring AI into organizations by providing a range of services. Google, in particular, is doubling down on research into its existing deep learning techniques. The company’s latest research, AutoAugment: Learning Augmentation Policies from Data, uses a reinforcement learning algorithm to increase both the amount and the variety of data in an existing training dataset.

What is AutoAugment?

AutoAugment is a recent research paper from the Google team that tackles one of the biggest hurdles in deep learning: the need for a huge amount of quality data to train models. The technique uses machine learning to find ways to automatically augment existing data.

The paper builds on a procedure called data augmentation, applied here to images, and aims to find improved data augmentation policies. The idea is to create a search space of data augmentation policies and evaluate the quality of each policy directly on the dataset. In the search space the researchers created, each policy consists of many sub-policies, and one sub-policy is randomly chosen for each image in each mini-batch.

A sub-policy in turn consists of two operations. Each operation is an image processing function, such as translation, rotation, or shearing, along with the probability and magnitude with which it is applied. A search algorithm is used to find the best policy, that is, the one for which the neural network yields the highest validation accuracy on the target dataset.
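To make this structure concrete, here is a minimal Python sketch of how a policy of this shape could be applied to a mini-batch. It is an illustration under stated assumptions, not the researchers' implementation: the image representation (nested lists of grayscale values), the three operations, the function names, and the numbers in EXAMPLE_POLICY are all hypothetical.

```python
import random

# Assumption for this sketch: an image is a list of rows of grayscale
# pixel values in 0..255. A real pipeline would use an image library.


def flip_horizontal(image, magnitude):
    # Mirror every row. The magnitude is ignored for this operation.
    return [row[::-1] for row in image]


def flip_vertical(image, magnitude):
    # Reverse the row order. The magnitude is ignored for this operation.
    return [row[:] for row in image[::-1]]


def brighten(image, magnitude):
    # Add the magnitude to every pixel and clamp the result to 0..255.
    return [[min(255, max(0, p + magnitude)) for p in row] for row in image]


# Hypothetical operation table; the names are chosen for illustration.
OPERATIONS = {
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "brighten": brighten,
}


def apply_sub_policy(image, sub_policy, rng):
    # A sub-policy is a sequence of (operation name, probability, magnitude).
    # Each operation is applied in order with its own probability.
    for name, probability, magnitude in sub_policy:
        if rng.random() < probability:
            image = OPERATIONS[name](image, magnitude)
    return image


def augment_batch(batch, policy, rng):
    # For every image in the mini-batch, pick one sub-policy at random
    # and apply it, so images are not always transformed the same way.
    return [apply_sub_policy(image, rng.choice(policy), rng) for image in batch]


# Hypothetical policy made of three sub-policies with two operations each.
EXAMPLE_POLICY = [
    [("flip_horizontal", 0.5, 0), ("brighten", 0.8, 40)],
    [("flip_vertical", 0.3, 0), ("brighten", 0.6, -30)],
    [("flip_horizontal", 0.9, 0), ("flip_vertical", 0.2, 0)],
]

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make runs repeatable
    batch = [[[0, 64], [128, 255]] for _ in range(4)]
    for augmented in augment_batch(batch, EXAMPLE_POLICY, rng):
        print(augmented)
```

Because the sub-policy is drawn per image and each operation fires with its own probability, the same input image can come out differently from one mini-batch to the next, which is the source of the added variety.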

Why AutoAugment?

One of the core reasons deep learning does exceptionally well in computer vision is the availability of large amounts of labeled training data. A model’s performance generally improves as the quality and amount of training data increase. However, collecting enough quality data to train a model for the best results is difficult.

One way to deal with this issue is to hardcode image symmetries into neural network architectures. Alternatively, researchers and developers manually design data augmentation techniques, such as rotation and flipping, which are used extensively to train computer vision models. However, designing these by hand can be time-consuming and tedious.

Now, imagine a technique that automatically augments existing data using machine learning. The Google team took inspiration from the results of its AutoML research, which was used to design neural network architectures and optimizers to replace system components previously designed by humans. They set out to do the same for the procedure of data augmentation.

Data augmentation improves performance by teaching the model about image invariances in the data domain (images have many symmetries that don’t change the information present in the image), so that the neural network becomes invariant to these important symmetries. Traditional deep learning models use human-designed data augmentation policies, whereas this technique uses a reinforcement learning algorithm to find the optimal image transformation policies from the data itself. This can improve the performance of computer vision models considerably.
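The shape of such a search can be sketched in a few lines of Python. Note the substitution: this sketch uses plain random search in place of the reinforcement learning algorithm described in the research, purely to show the loop of proposing a policy, scoring it, and keeping the best one. The operation names, the number of sub-policies, the probability and magnitude ranges, and the `evaluate` callback (which would train a model with the policy and return its validation accuracy) are all assumptions made for illustration.

```python
import random

# Hypothetical operation names; a real search space would be larger.
OP_NAMES = ["flip_horizontal", "flip_vertical", "rotate", "color"]


def sample_sub_policy(rng):
    # Two operations, each with a probability in {0.0, 0.1, ..., 1.0}
    # and an integer magnitude level in 0..9 (illustrative ranges).
    return [
        (rng.choice(OP_NAMES), rng.randint(0, 10) / 10, rng.randint(0, 9))
        for _ in range(2)
    ]


def sample_policy(rng, num_sub_policies=5):
    # A policy is a list of sub-policies.
    return [sample_sub_policy(rng) for _ in range(num_sub_policies)]


def search(evaluate, num_trials, rng):
    # Random search stand-in for the reinforcement learning search:
    # sample a policy, score it with evaluate(policy), keep the best.
    best_policy, best_score = None, float("-inf")
    for _ in range(num_trials):
        policy = sample_policy(rng)
        score = evaluate(policy)
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy, best_score


if __name__ == "__main__":
    # Toy scoring function used only so the sketch runs end to end.
    # In practice, evaluate would train a model and measure accuracy.
    def toy_evaluate(policy):
        return sum(prob for sub in policy for _, prob, _ in sub)

    policy, score = search(toy_evaluate, num_trials=50, rng=random.Random(0))
    print(score)
    for sub_policy in policy:
        print(sub_policy)
```

In a real setting, the expensive part is `evaluate`, since every candidate policy requires training a model, which is why the choice of search algorithm matters.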

Advantages of using AutoAugment

  • AutoAugment automatically designs custom data augmentation policies for computer vision datasets. It selects from basic image transformation operations, such as flipping the image horizontally or vertically, changing the color of the image, and more, and predicts which transformations to combine. It also predicts the per-image probability and magnitude of each transformation, so that the image is not always manipulated in the same way.
  • It automatically learns different transformations based on the dataset used.
  • The AutoAugment algorithm has found better augmentation policies for some of the most widely used computer vision datasets, which led to better accuracy when incorporated into the training of the neural network.
  • AutoAugment achieves a new state-of-the-art top-1 accuracy of 83.54% on ImageNet.
  • On CIFAR-10, it achieves an error rate of 1.48%, a 0.83% improvement over the traditional, human-designed data augmentation. On the Street View House Numbers (SVHN) dataset, it improves the state-of-the-art error rate from 1.30% to 1.02%.
  • Most importantly, AutoAugment policies are transferable: the policy found for the ImageNet dataset can also be applied to other datasets, improving neural network performance there as well.

The AutoAugment technique has shown promising results on popular computer vision datasets. The researchers hope it will work well across more computer vision tasks, and even in other domains such as audio processing or language modeling. You can refer to the research paper to apply its policies and improve your model’s performance on relevant computer vision tasks. For complete details, visit the official Google blog.
