
Researchers feed the rabbit-duck illusion to the Google Cloud Vision API and conclude it shows orientation bias


Last week, when Janelle Shane, a research scientist in optics, fed the famous rabbit-duck illusion to the Google Cloud Vision API, it gave “rabbit” as the result. However, when the image was rotated to a different angle, the Google Cloud Vision API predicted “duck”.

Inspired by this, Max Woolf, a data scientist at BuzzFeed, tested the illusion further and concluded that the result does indeed vary with the orientation of the image.

Google Cloud Vision provides pretrained API models that allow you to derive insights from input images. The API classifies images into thousands of categories, detects individual objects and faces within images, and reads printed words within images. You can also train custom vision models with AutoML Vision Beta.
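For reference, a basic label-detection request with the official google-cloud-vision Python client looks roughly like the sketch below. This is a minimal illustration, not code from the article: the file name is a placeholder, and it assumes Google Cloud credentials are already configured in the environment.

# Minimal label-detection sketch using the google-cloud-vision client.
# Assumes credentials are set up in the environment;
# "duck_rabbit.jpg" is a placeholder file name.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Read the image bytes and wrap them in the request type the API expects.
with open("duck_rabbit.jpg", "rb") as f:
    image = vision.Image(content=f.read())

# Ask the API for labels and print each one with its confidence score.
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, round(label.score, 3))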

Woolf used Python to rotate the image and get predictions from the API for each rotation. He built the animations with R, ggplot2, and gganimate, and rendered them with ffmpeg. In deep learning, a model is often trained with a data-augmentation strategy in which the input images are rotated to help the model generalize better. Seeing the results of the experiment, Woolf concluded, “I suppose the dataset for the Vision API didn’t do that as much / there may be an orientation bias of ducks/rabbits in the training datasets.”
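Woolf’s actual script is on his GitHub page (linked below), but the core of the experiment amounts to a loop like the following sketch, which rotates the image with Pillow and records the API’s top label at each angle. The 15-degree step, the white fill color, and the file name are illustrative assumptions, not details from Woolf’s code.

# Sketch of the rotate-and-classify loop: rotate the illusion in fixed
# steps and record the Vision API's top label at each angle.
# The 15-degree step, fill color, and file name are assumptions.
import io

from PIL import Image
from google.cloud import vision

client = vision.ImageAnnotatorClient()
original = Image.open("duck_rabbit.jpg").convert("RGB")

for angle in range(0, 360, 15):
    # Rotate, padding the corners with white so nothing is cropped.
    rotated = original.rotate(angle, expand=True, fillcolor="white")
    buf = io.BytesIO()
    rotated.save(buf, format="JPEG")

    response = client.label_detection(image=vision.Image(content=buf.getvalue()))
    labels = response.label_annotations
    if labels:
        print(angle, labels[0].description, round(labels[0].score, 3))
    else:
        print(angle, "no label returned")

Collecting the (angle, label, score) rows from a loop like this is what yields the per-rotation prediction results mentioned below.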

The reaction to this experiment was divided. While many Reddit users felt that there might be an orientation bias in the model, others felt that, as the image is ambiguous, there is no “right answer” and hence no problem with the model.

One Redditor said, “I think this shows how poorly many neural networks are at handling ambiguity.” Another Redditor commented, “This has nothing to do with a shortcoming of deep learning, failure to generalize, or something not being in the training set. It’s an optical illusion drawing meant to be visually ambiguous. Big surprise, it’s visually ambiguous to computer vision as well. There’s not ‘correct’ answer, it’s both a duck and a rabbit, that’s how it was drawn. The fact that the Cloud vision API can see both is actually a strength, not a shortcoming.”

Woolf has open-sourced the code used to generate this visualization on his GitHub page, which also includes a CSV of the prediction results at every rotation. If you are curious, you can test the Cloud Vision API yourself with the drag-and-drop UI provided by Google.

Read Next

Google Cloud security launches three new services for better threat detection and protection in enterprises

Generating automated image captions using NLP and computer vision [Tutorial]

Google Cloud Firestore, the serverless, NoSQL document database, is now generally available


Bhagyashree R

