Researchers at Columbia University have successfully translated brain activity into words using deep learning and a speech synthesizer. They used a technique called auditory stimulus reconstruction, which combines recent advances in deep learning with the latest innovations in speech synthesis to reconstruct closed-set intelligible speech from the human auditory cortex.

They temporarily placed five electrodes in the brains of five people who were about to undergo brain surgery for epilepsy. The five patients were asked to listen to recordings of sentences, and their brain activity was used to train deep-learning-based speech recognition software. They then listened to recordings of 40 numbers being spoken.

The AI then tried to decode what they had heard from their brain activity and spoke the results aloud in a robotic voice. According to listeners who heard the robotic voice, the synthesizer's output was understandable as the right word 75% of the time.


Source: Nature.com

According to MIT Technology Review, “At the moment the technology can only reproduce words that these five patients have heard—and it wouldn’t work on anyone else.” However, the researchers believe that such a technology could help people who have been paralyzed communicate with their family and friends despite having lost the ability to speak.

Dr. Nima Mesgarani, an associate professor at Columbia University, said, “One of the motivations of this work…is for alternative human-computer interaction methods, such as a possible interface between a user and a smartphone.”

According to the report, “Our approach takes a step toward the next generation of human-computer interaction systems and more natural communication channels for patients suffering from paralysis and locked-in syndromes.”

To learn more about this experiment, head over to the complete report.

