An individually trained semantic decoder translated brain patterns on functional MRI (fMRI) into continuous streams of text, a small study showed.
The customized decoder used fMRI responses to generate text describing an individual's imagined stories, the contents of podcasts they listened to, and silent videos they watched, reported Alexander Huth, PhD, of the University of Texas at Austin, and co-authors in Nature Neuroscience.
The decoder predictions often recovered the gist of a person's internal speech, but the result was not a verbatim transcript. About half the time, the decoder produced text that closely or precisely matched the intended meanings of the words a person was thinking.
"For a noninvasive method, this is a real leap forward compared to what's been done before, which is typically single words or short sentences," Huth said in a press briefing. "We're getting the model to decode continuous language for extended periods of time with complicated ideas."
The system requires extensive training to work, Huth noted. "A person needs to spend up to 15 hours lying in an MRI scanner, being perfectly still, and paying good attention to stories that they're listening to before this really works well on them," he said.
The goal of language decoding is to use brain activity recordings to predict the words someone is hearing, saying, or imagining, said co-author Jerry Tang, PhD candidate at the University of Texas at Austin. "Eventually, we hope this technology can help people who lost the ability to speak due to injuries like stroke or diseases like ALS," he stated.
"Our study is the first to decode continuous language -- meaning more than single words or sentences -- from noninvasive brain recordings," Tang added. Currently, decoding language from neural activity relies mainly on invasive brain-computer interfaces that require surgical implants.
The researchers trained the decoder by recording fMRI data from three study participants -- two men and one woman in their 20s and 30s -- as they listened to 16 hours of narrative stories from radio shows and podcasts like "The Moth Radio Hour" and "Modern Love."
"We used this dataset to build a model that takes in any sequence of words and predicts how the user's brain would respond when hearing those words," Tang said. The language model used was GPT-1, an early predecessor of the models behind ChatGPT.
Each decoder model analyzed brain responses as participants listened to new stories that were not part of the training dataset. The decoder generated word sequences that captured the meanings of the new stories, including some exact words and phrases from the stories. Most timepoints in the story (72%-82%) had a significantly higher BERTScore -- a metric that rates how similar generated text is in meaning to a reference text -- than expected by chance.
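For readers who want a feel for the approach, the toy Python sketch below illustrates the general idea Tang describes: fit a model that predicts a brain response from a word sequence, then pick the candidate word sequence whose predicted response best matches what was recorded. It is not the study's code. The simulated "voxels," the pseudo-random word vectors (standing in for language-model features), the ridge regression, and the helper names are all assumptions made for illustration.

```python
import random

DIM = 6        # toy feature size (assumption; real language-model features are far larger)
N_VOXELS = 20  # number of simulated voxels (assumption)
VOCAB = ["story", "dog", "road", "night", "mother", "phone",
         "car", "door", "voice", "water", "friend", "house"]


def word_vector(word):
    """Deterministic pseudo-random vector standing in for language-model features."""
    rng = random.Random(word)  # string seeds give the same vector on every run
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]


def featurize(words):
    """Average the word vectors of a sequence into one feature vector."""
    vectors = [word_vector(w) for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def respond(weights, words):
    """Response per voxel: a linear function of the sequence features."""
    x = featurize(words)
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_encoding_model(features, responses, ridge=1e-3):
    """Ridge regression fitted separately per voxel; returns weights[voxel][feature]."""
    d = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    weights = []
    for v in range(len(responses[0])):
        xty = [sum(f[i] * r[v] for f, r in zip(features, responses)) for i in range(d)]
        weights.append(solve(xtx, xty))
    return weights


def mismatch(predicted, observed):
    """Sum of squared differences between predicted and recorded responses."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def demo():
    rng = random.Random(0)
    # Hidden "brain": the true word-to-response mapping we try to learn.
    true_weights = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N_VOXELS)]

    def scan(words):
        # Simulated recording: true response plus a little measurement noise.
        return [r + rng.gauss(0.0, 0.01) for r in respond(true_weights, words)]

    # "Training": the simulated participant hears 200 short word sequences.
    train_words = [[rng.choice(VOCAB) for _ in range(3)] for _ in range(200)]
    weights = fit_encoding_model([featurize(w) for w in train_words],
                                 [scan(w) for w in train_words])

    # "Decoding": choose the candidate whose predicted response fits the new scan best.
    heard = ["dog", "door", "night"]
    candidates = [["car", "water", "phone"], heard, ["mother", "friend", "house"]]
    observed = scan(heard)
    return min(candidates, key=lambda c: mismatch(respond(weights, c), observed))


if __name__ == "__main__":
    print("Best-matching candidate:", demo())
```

In this simplified setup the candidate list is fixed by hand. A real decoder would need some way to propose candidate word sequences, and fMRI signals are far noisier and slower than the simulated responses here.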
The researchers tested the decoder on people it hadn't been trained on and found the results were unintelligible. If participants actively resisted, the results were also unusable.
But that doesn't mean fMRI recordings couldn't be used against people one day, noted Nita Farahany, JD, PhD, of Duke University in Durham, North Carolina.
"This research illustrates the rapid advances being made toward an age of much greater brain transparency, where even continuous language and semantic meaning can be decoded from the brain," Farahany said.
"While people can employ effective countermeasures to prevent decoding their brains using fMRI, as brain wearables become widespread that may not be an effective way to protect us from interception, manipulation, or even punishment for our thoughts," Farahany observed.
Policies to protect mental privacy may be needed as technology evolves, Tang stated. "I think right now, while the technology is in such an early state, it's important to be proactive by enacting policies that protect people and their privacy," he said. "Regulating what these devices can be used for is also very important."
Currently, the system isn't practical to use routinely because it relies on fMRI, but it's possible the technology could work with portable brain-imaging systems like functional near-infrared spectroscopy (fNIRS), Huth observed.
"fNIRS measures where there's more or less blood flow in the brain at different points in time, which, it turns out, is exactly the same kind of signal that fMRI is measuring," Huth said. "So, our exact kind of approach should translate to fNIRS," but the resolution with fNIRS would be lower.
Disclosures
The study was supported by the National Institute on Deafness and Other Communication Disorders, the Whitehall Foundation, the Alfred P. Sloan Foundation, and the Burroughs Wellcome Fund.
Huth and Tang are inventors on a pending patent application (the applicant is the University of Texas System) that is directly relevant to the language decoding approach used in this work. All other authors declared no competing interests.
Primary Source
Nature Neuroscience
Tang J, et al "Semantic reconstruction of continuous language from non-invasive brain recordings" Nat Neurosci 2023; DOI: 10.1038/s41593-023-01304-9.