Researchers at Columbia University have carried out a successful experiment in which they translated brain activity into words using deep learning and a speech synthesizer. They used an auditory stimulus reconstruction technique, which combines recent advances in deep learning with the latest innovations in speech synthesis to reconstruct closed-set intelligible speech from recordings of the human auditory cortex.
They temporarily placed electrodes in the brains of five people who were about to undergo brain surgery for epilepsy. These five were asked to listen to recordings of sentences, and their brain activity was used to train deep-learning-based speech recognition software. After this, they listened to 40 spoken numbers.
The AI then tried to decode what they had heard from their brain activity and spoke the results aloud in a robotic voice. Listeners who heard the synthesized voice could understand it and identify the correct word about 75% of the time.
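The core idea is to learn a mapping from neural activity back to the speech the subject heard. The study itself used deep neural networks and a speech vocoder; the toy sketch below substitutes a simple ridge-regression decoder on simulated data, so every dimension, signal, and parameter here is illustrative rather than taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the experiment: "neural" recordings are a noisy linear
# mixture of a hidden spectrogram. The real study recorded from the auditory
# cortex and decoded with deep networks; sizes here are made up.
n_samples, n_freq, n_elec = 2000, 16, 64

spectrogram = rng.random((n_samples, n_freq))        # hidden speech features
mixing = rng.normal(size=(n_freq, n_elec))           # hypothetical cortical "encoding"
neural = spectrogram @ mixing + 0.1 * rng.normal(size=(n_samples, n_elec))

# Ridge-regression decoder: neural activity -> reconstructed spectrogram
lam = 1.0
gram = neural.T @ neural + lam * np.eye(n_elec)
weights = np.linalg.solve(gram, neural.T @ spectrogram)
recon = neural @ weights

# Mean per-frequency-bin correlation between true and reconstructed speech
corr = np.mean([np.corrcoef(spectrogram[:, k], recon[:, k])[0, 1]
                for k in range(n_freq)])
print(f"mean reconstruction correlation: {corr:.2f}")
```

In the toy setup the decoder recovers the hidden spectrogram almost perfectly because the mixing is linear and the noise is mild; real cortical responses are nonlinear and far noisier, which is why the study needed deep networks and a vocoder rather than a linear map.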
According to the Technology Review, “At the moment the technology can only reproduce words that these five patients have heard—and it wouldn’t work on anyone else.” However, the researchers believe that such a technology could help people who have been paralyzed communicate with their family and friends, despite losing the ability to speak.
Dr. Nima Mesgarani, an associate professor at Columbia University, said “One of the motivations of this work…is for alternative human-computer interaction methods, such as a possible interface between a user and a smartphone.”
According to the report, “Our approach takes a step toward the next generation of human-computer interaction systems and more natural communication channels for patients suffering from paralysis and locked-in syndromes.”
To learn more about this experiment, head over to the complete report.