
Google recently introduced NIMA (Neural Image Assessment), a deep convolutional neural network.

NIMA is trained to predict which images a user would consider technically or aesthetically attractive. Like object recognition networks, it can generalize across many variations of objects within a category. It scores images reliably, with high correlation to human perception, and can also assist with other labor-intensive and subjective tasks such as intelligent photo editing, optimizing visual quality to improve user engagement, or minimizing perceived visual errors in an imaging pipeline.

Assessment of image quality and aesthetics has been a persistent problem in image processing and computer vision. Image quality assessment deals with measuring pixel-level degradations such as noise, blur, and compression artifacts, whereas aesthetic assessment captures semantic-level characteristics associated with emotion and beauty in images.

In recent times, deep CNNs trained on human-labeled data have been used to assess the subjective quality of images for some specific classes, for instance, landscapes. But such an approach is limited, as it categorizes images into only two classes, namely high and low quality. NIMA, on the contrary, predicts the full distribution of ratings. This leads to higher-quality predictions with a stronger correlation to the ground-truth ratings. It can also be applied to images in general rather than only to specific classes.

Let’s explore some applications of the NIMA model:

  • Distribution of ratings

Instead of classifying images as low/high quality or regressing to the mean score, the NIMA model produces a distribution of ratings for any given image, on a scale of 1 to 10, with 10 being the highest aesthetic score associated with an image.

It assigns likelihoods to each of the possible scores, which is more directly in line with how training data is typically captured. Hence, it turns out to be a better predictor of human preferences when measured against other approaches.
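To make this concrete, here is a minimal sketch of how summary statistics could be computed from such a distribution. The model itself is not shown; the distribution `p` stands in for a hypothetical NIMA-style softmax output over the scores 1 to 10, and all values here are made up for illustration.

```python
def mean_score(p):
    """Expected score given probabilities p[0..9] for the scores 1..10."""
    return sum(s * q for s, q in zip(range(1, 11), p))

def std_score(p):
    """Standard deviation of the predicted score distribution."""
    mu = mean_score(p)
    var = sum(q * (s - mu) ** 2 for s, q in zip(range(1, 11), p))
    return var ** 0.5

# Illustrative distribution skewed toward high scores.
p = [0.0, 0.0, 0.02, 0.03, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10]
print(round(mean_score(p), 2))  # mean predicted score
```

The mean can serve as a single quality number, while the standard deviation indicates how much raters would be expected to disagree about the image.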

  • Ranking photos aesthetically

Various functions of the NIMA score vector, such as the mean, can be used to rank photos aesthetically. In tests on photos from the large-scale Aesthetic Visual Analysis (AVA) dataset, where each photo is scored by an average of 200 people in response to photography contests, the NIMA model closely matched the mean scores given by human raters after training. NIMA is therefore likely to perform equally well on other datasets, with predicted quality scores close to human ratings.
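As a rough sketch of this idea, a set of photos could be ranked by the mean of each photo's predicted score distribution. The filenames and distributions below are invented purely for illustration; a real pipeline would obtain the distributions from the trained model.

```python
def mean_score(p):
    """Expected score given probabilities p[0..9] for the scores 1..10."""
    return sum(s * q for s, q in zip(range(1, 11), p))

# Hypothetical per-photo score distributions (all values made up).
photos = {
    "sunset.jpg":   [0, 0, 0, 0.05, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10],
    "blurry.jpg":   [0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0.05, 0, 0, 0],
    "portrait.jpg": [0, 0, 0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05, 0],
}

# Rank photos from highest to lowest mean predicted score.
ranked = sorted(photos, key=lambda name: mean_score(photos[name]), reverse=True)
print(ranked)
```

Other functions of the score vector, such as the standard deviation, could be used in the sort key instead, for example to surface images that raters would find divisive.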

  • NIMA scoring for detecting quality of an image

NIMA scores can also be used to compare the quality of images that have the same subject but have been distorted in different ways. For instance, the predicted mean scores can be used to qualitatively rank photos, as shown in the figure below. These images are part of the TID2013 test set, which contains various types and levels of distortions.

Source: https://arxiv.org/pdf/1709.05424.pdf

  • Perceptual Image Enhancement

Quality and aesthetic scores can be used to perceptually tune image enhancement operators. In other words, maximizing the NIMA score as part of a loss function can increase the likelihood of enhancing an image's perceptual quality, that is, its quality as interpreted through human senses. NIMA can be used as a training loss to tune a tone enhancement algorithm: baseline aesthetic ratings can be improved by contrast adjustments directed by the NIMA score. The NIMA model can also guide a deep CNN filter to find aesthetically near-optimal settings of its parameters, such as brightness, highlights, and shadows.
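A conceptual sketch of this loss construction follows. Since higher NIMA scores are better but training minimizes a loss, the aesthetic term is written as a penalty that shrinks as the predicted mean score rises. Everything here is illustrative: `combined_loss`, the weighting `lambda_aesthetic`, and the input values are assumptions, and a real setup would backpropagate through the scoring model.

```python
def combined_loss(reconstruction_loss, nima_mean_score, lambda_aesthetic=0.1):
    """Total loss = standard reconstruction term + weighted aesthetic penalty.

    nima_mean_score is the mean of the predicted rating distribution (1..10),
    so minimizing (10 - score) is equivalent to maximizing the score.
    """
    aesthetic_penalty = 10.0 - nima_mean_score
    return reconstruction_loss + lambda_aesthetic * aesthetic_penalty

# With equal reconstruction loss, a higher predicted aesthetic score
# yields a lower total loss, steering the enhancer toward better-rated outputs.
print(round(combined_loss(0.5, 8.0), 2))
print(round(combined_loss(0.5, 4.0), 2))
```

The weight `lambda_aesthetic` trades off fidelity to the original image against predicted attractiveness; set too high, the enhancer may drift away from the input content.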

To summarize, with NIMA, Google suggests that ML-based quality assessment models may be capable of a wide range of useful functions, for instance improving image capture, or sorting the best pictures out of many.

For a deeper understanding of the workings of NIMA, you can go through the research paper.

