The U.S. Defense Advanced Research Projects Agency (DARPA) has come out with AI-based forensic tools to catch deepfakes, as first reported by MIT Technology Review yesterday. According to MIT Technology Review, more tools are currently in development to expose fake images and revenge porn videos on the web. DARPA’s deepfake detection project was announced earlier this year.
As mentioned in the MediFor blog post, “While many manipulations are benign, performed for fun or for artistic value, others are for adversarial purposes, such as propaganda or misinformation campaigns”. This is one of the major reasons why DARPA’s forensics experts are keen on finding methods to detect deepfake videos and images.
How did deepfakes originate?
Back in December 2017, a Reddit user named “DeepFakes” posted extremely real-looking explicit videos of celebrities. He used deep learning techniques to insert celebrities’ faces into adult movies. Using deep learning, one can combine and superimpose existing images and videos onto original images or videos to create realistic-seeming fake videos.
According to MIT Technology Review, video forgeries are made using a machine-learning technique called generative modeling, which “lets a computer learn from real data before producing fake examples that are statistically similar”. The tampering relies on two neural networks, together known as a generative adversarial network, which work in conjunction “to produce ever more convincing fakes”.
Why are deepfakes toxic?
An app named FakeApp, released earlier this year, made it quite easy to create deepfakes. FakeApp uses neural network tools developed by Google’s AI division. The app trains itself to perform image-recognition tasks using trial and error. Since its release, the app has been downloaded more than 120,000 times.
In fact, there are tutorials online on how to create deepfakes. Apart from this, there are regular requests on deepfake forums asking users for help in creating face-swap porn videos of ex-girlfriends, classmates, politicians, celebrities, and teachers. Deepfakes can even be used to create fake news, such as world leaders declaring war on a country. The toxic potential of this technology has led to growing concern, as deepfakes have become a powerful tool for harassing people.
Once deepfakes found their way onto the World Wide Web, many websites, such as Twitter and PornHub, banned them from being posted on their platforms. Reddit also announced a ban on deepfakes earlier this year, shutting down the “deepfakes” subreddit, which had more than 90,000 subscribers.
MediFor: DARPA’s AI weapon to counter deepfakes
DARPA’s Media Forensics group, also known as MediFor, is working with other researchers to develop AI tools that detect deepfakes. It is currently focusing on four techniques to catch the audiovisual discrepancies present in a forged video: analyzing lip sync, and detecting speaker inconsistency, scene inconsistency, and content insertions.
One technique comes from a team led by Professor Siwei Lyu of SUNY Albany. Lyu mentioned that they “generated about 50 fake videos and tried a bunch of traditional forensics methods. They worked on and off, but not very well”. As deepfakes are created using static images, Lyu noticed that the faces in deepfake videos rarely blink and that eye movement, if present, is quite unnatural.
An academic paper titled “In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking,” by Yuezun Li, Ming-Ching Chang and Siwei Lyu explains a method to detect forged videos. It makes use of Long-term Recurrent Convolutional Networks (LRCN).
According to the research paper, people blink about 17 times a minute on average, or 0.283 times per second. This rate increases during conversation and decreases while reading. Several other techniques exist for eye blink detection, such as detecting the eye state by computing the vertical distance between eyelids, measuring the eye aspect ratio (EAR), and using a convolutional neural network (CNN) to detect open and closed eye states.
Li, Chang, and Lyu, however, use a different approach. They rely on a Long-term Recurrent Convolutional Networks (LRCN) model. They first perform pre-processing to identify facial features and normalize the video frame orientation. Then, they pass cropped eye images into the LRCN for evaluation. This technique is quite effective and compares well with the other approaches, with a reported accuracy of 0.99 (LRCN) compared to 0.98 (CNN) and 0.79 (EAR).
However, Lyu says that a skilled video editor can fix non-blinking deepfakes by using images that show blinking eyes. Lyu’s team has an effective technique in the works to counter even that, though he hasn’t divulged any details. Others at DARPA are on the lookout for similar cues, such as strange head movements and odd eye color, as these little details are leading the team even closer to detecting deepfakes.
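For readers who want a feel for the simpler EAR baseline in that comparison, here is a minimal Python sketch. It is not the authors' LRCN method. It assumes six eye landmarks per frame are already available from some face-landmark detector, ordered in the commonly used p1 to p6 layout (outer corner, two upper-lid points, inner corner, two lower-lid points). The function names, the 0.2 closed-eye threshold, and the two-frame minimum are illustrative assumptions, not values taken from the paper.

```python
import math

# Average human blink rate quoted in the article: 17 blinks per minute.
REFERENCE_BLINKS_PER_SECOND = 17 / 60  # about 0.283


def eye_aspect_ratio(landmarks):
    """Return the eye aspect ratio for six (x, y) eye landmarks.

    Assumed order: p1 and p4 are the eye corners, p2 and p3 lie on the
    upper lid, p5 and p6 on the lower lid (p2 opposite p6, p3 opposite p5).
    """
    p1, p2, p3, p4, p5, p6 = landmarks
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count runs of at least min_frames consecutive frames below threshold.

    threshold and min_frames are assumed values chosen for illustration.
    """
    blinks = 0
    run = 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:  # the series may end while the eye is closed
        blinks += 1
    return blinks


def blinks_per_second(ear_series, fps, threshold=0.2, min_frames=2):
    """Convert a per-frame EAR series into a blink rate in blinks per second."""
    duration_seconds = len(ear_series) / fps
    return count_blinks(ear_series, threshold, min_frames) / duration_seconds
```

A clip whose measured rate falls far below `REFERENCE_BLINKS_PER_SECOND` would be a candidate for closer inspection. A fixed threshold like this is sensitive to head pose and image quality, which is consistent with the EAR approach scoring lowest in the comparison above.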
As mentioned in the MIT Technology Review post, “the arrival of these forensics tools may simply signal the beginning of an AI-powered arms race between video forgers and digital sleuths”. MediFor also states that “If successful, the MediFor platform will automatically detect manipulations, provide detailed information about how these manipulations were performed, and reason about the overall integrity of visual media to facilitate decisions regarding the use of any questionable image or video”.
Deepfakes need to stop, and the U.S. Defense Advanced Research Projects Agency (DARPA) seems all set to fight against them.