speech

Can a Mind-Reading Computer Speak for Those Who Cannot?

Credit: Adapted from Nima Mesgarani, Columbia University’s Zuckerman Institute, New York

Computers have learned to do some amazing things, from beating the world’s top-ranked chess masters to providing the equivalent of feeling in prosthetic limbs. Now, as heard in this brief audio clip counting from zero to nine, an NIH-supported team has combined innovative speech synthesis technology and artificial intelligence to teach a computer to read a person’s thoughts and translate them into intelligible speech.

Turning brain waves into speech isn’t just fascinating science. It might also prove life changing for people who have lost the ability to speak from conditions such as amyotrophic lateral sclerosis (ALS) or a debilitating stroke.

When people speak or even think about talking, their brains fire off distinctive, but previously poorly decoded, patterns of neural activity. Nima Mesgarani and his team at Columbia University’s Zuckerman Institute, New York, wanted to learn how to decode this neural activity.

Mesgarani and his team started out with a vocoder, a voice synthesizer that produces sounds based on an analysis of speech. It’s the very same technology used by Amazon’s Alexa, Apple’s Siri, or other similar devices to listen and respond appropriately to everyday commands.

As reported in Scientific Reports, the first task was to train a vocoder to produce synthesized sounds in response to brain waves instead of speech [1]. To do it, Mesgarani teamed up with neurosurgeon Ashesh Mehta, Hofstra Northwell School of Medicine, Manhasset, NY, who frequently performs brain mapping in people with epilepsy to pinpoint the sources of seizures before performing surgery to remove them.

In five patients already undergoing brain mapping, the researchers monitored activity in the auditory cortex, where the brain processes sound. The patients listened to recordings of short stories read by four speakers. In the first test, eight different sentences were repeated multiple times. In the next test, participants heard four new speakers repeat numbers from zero to nine.

From these exercises, the researchers reconstructed the words that people heard from their brain activity alone. Then the researchers tried various methods to reproduce intelligible speech from the recorded brain activity. They found it worked best to combine the vocoder technology with a form of computer artificial intelligence known as deep learning.

Deep learning is inspired by how our own brain’s neural networks process information, learning to focus on some details but not others. In deep learning, computers look for patterns in data. As they begin to “see” complex relationships, some connections in the network are strengthened while others are weakened.
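To make the idea of strengthened and weakened connections concrete, here is a minimal toy sketch in Python. It is not the researchers’ model: it is a single artificial “neuron” with two connections, and the data, the learning rate, and the function name `train_toy_network` are all assumptions made up for illustration. The target pattern depends only on the first input, so training should strengthen that connection and weaken the other.

```python
import random


def train_toy_network(steps=2000, learning_rate=0.05, seed=0):
    """Train one linear 'neuron' with two input connections (toy example)."""
    rng = random.Random(seed)
    # Both connections start out equally strong.
    weights = [0.5, 0.5]
    for _ in range(steps):
        # Two random input signals between -1 and 1.
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Hypothetical pattern to learn: only the first input matters.
        target = 2.0 * x[0]
        prediction = weights[0] * x[0] + weights[1] * x[1]
        error = prediction - target
        # Gradient descent step: nudge each connection to shrink the error.
        for i in range(2):
            weights[i] -= learning_rate * error * x[i]
    return weights


weights = train_toy_network()
print("informative connection: %.3f" % weights[0])
print("uninformative connection: %.3f" % weights[1])
```

After training, the connection to the informative input has grown from 0.5 toward 2, while the connection to the uninformative input has shrunk toward 0. Real deep learning networks apply the same kind of adjustment across many layers and millions of connections.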

In this case, the researchers used the deep learning networks to interpret the sounds produced by the vocoder in response to the brain activity patterns. Once the vocoder-produced sounds had been processed and “cleaned up” by those neural networks, the reconstructed sounds became easier for a listener to understand as recognizable words, though this first attempt still sounds pretty robotic.

The researchers will continue testing their system with more complicated words and sentences. They also want to run the same tests on brain activity, comparing what happens when a person speaks or just imagines speaking. They ultimately envision an implant, similar to those already worn by some patients with epilepsy, that will translate a person’s thoughts into spoken words. That might open up all sorts of awkward moments if some of those thoughts weren’t intended for transmission!

Along with recently highlighted new ways to catch irregular heartbeats and cervical cancers, it’s yet another remarkable example of the many ways in which computers and artificial intelligence promise to transform the future of medicine.

Reference:

[1] Towards reconstructing intelligible speech from the human auditory cortex. Akbari H, Khalighinejad B, Herrero JL, Mehta AD, Mesgarani N. Sci Rep. 2019 Jan 29;9(1):874.

Links:

Advances in Neuroprosthetic Learning and Control. Carmena JM. PLoS Biol. 2013;11(5):e1001561.

Nima Mesgarani (Columbia University, New York)

NIH Support: National Institute on Deafness and Other Communication Disorders; National Institute of Mental Health


How the Brain Regulates Vocal Pitch

Credit: University of California, San Francisco

Whether it’s hitting a high note, delivering a punch line, or reading a bedtime story, the pitch of our voices is a vital part of human communication. Now, as part of their ongoing quest to produce a dynamic picture of neural function in real time, researchers funded by the NIH’s Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative have identified the part of the brain that controls vocal pitch [1].

This improved understanding of how the human brain regulates the pitch of sounds emanating from the voice box, or larynx, is more than cool neuroscience. It could aid in the development of new, more natural-sounding technologies to assist people who have speech disorders or who’ve had their larynxes removed due to injury or disease.


Creative Minds: A Baby’s Eye View of Language Development

If you are a fan of wildlife shows, you’ve probably seen those tiny video cameras rigged to animals in the wild that provide a sneak peek into their secret domains. But not all research cams are mounted on creatures with fur, feathers, or fins. One of NIH’s 2014 Early Independence Award winners has developed a baby-friendly, head-mounted camera system (shown above) that captures the world from an infant’s perspective and explores one of our most human, but still imperfectly understood, traits: language.

Elika Bergelson
Credit: Zachary T. Kern

Elika Bergelson, a young researcher at the University of Rochester in New York, wants to know exactly how and when infants acquire the ability to understand spoken words. Using innovative camera gear and other investigative tools, she hopes to refine current thinking about the natural timeline for language acquisition. Bergelson also hopes her work will pay off in a firmer theoretical foundation to help clinicians assess children with poor verbal skills or with neurodevelopmental conditions that impair information processing, such as autism spectrum disorders.


The Science of Stuttering

VP Biden: Portrait shoot by Andrew "Andy" Cutraro. 459 EEOB Studio

Credit: White House

Stuttering is a speech disorder that’s affected some very famous people, including King George VI, actress Marilyn Monroe, and, believe it or not, even Vice President Joe Biden.

About 5% of children stutter, but many, like the Vice President, outgrow the disorder.

About 1% of adults stutter. That’s about 3 million people in the United States and 60 million worldwide.

Until recently, the cause of most stuttering was a mystery. However, researchers at the NIH’s National Institute on Deafness and Other Communication Disorders have identified several genes involved in inherited forms of stuttering and are busy looking for additional clues that may open new avenues for treatment. Find out more about what science is doing to help.