The devices transmit signals from speech-related regions in the brain to state-of-the-art software. That software decodes brain activity and converts it to text displayed on a computer screen.
Pat Bennett, 68, received a diagnosis of amyotrophic lateral sclerosis (ALS) in 2012. The progressive neurodegenerative disease attacks neurons controlling movement, causing physical weakness and eventual paralysis. Because Bennett’s deterioration began in her brain stem, she lost the ability to speak intelligibly. She was still able, however, to use the sensors to communicate through the software.
In March of last year, a Stanford Medicine neurosurgeon placed two tiny sensors apiece in two regions along the surface of Bennett’s brain. The sensors are components of an intracortical brain-computer interface (BCI). Combined with the decoding software, they translate the brain activity accompanying attempts at speech into words on the screen.
About a month after her surgery, the research team began twice-weekly sessions to train the software on Bennett’s brain activity. After four months, the system was converting her attempted speech into words on a computer screen at 62 words per minute. According to Stanford, that’s more than three times as fast as the previous record for BCI-assisted communication.
“These initial results have proven the concept, and eventually technology will catch up to make it easily accessible to people who cannot speak,” Bennett said (through the speech translation method). “For those who are nonverbal, this means they can stay connected to the bigger world, perhaps continue to work, maintain friends and family relationships.”
About the sensors and how they work
Dr. Jaimie Henderson, who performed the surgery, said Bennett’s pace begins to approach the rate of natural conversation for English speakers — 160 words per minute.
“We’ve shown you can decode intended speech by recording activity from a very small area on the brain’s surface,” Henderson said.
The sensors implanted in Bennett’s cerebral cortex are square arrays of tiny silicon electrodes. Each array contains 64 electrodes arranged in an 8-by-8 grid, spaced apart by about half the thickness of a credit card, according to Stanford. The electrodes penetrate the cerebral cortex to a depth of about the thickness of two stacked quarters.
Gold wires attached to the arrays exit through pedestals screwed to the skull, which are connected by cable to a computer. An AI algorithm receives and decodes the electrical signals from Bennett’s brain, teaching itself to distinguish the activity associated with formulating speech. It then delivers its “best guess” at the sequence of phonemes she is attempting to produce into a language model.
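In rough outline, that decoding step is a pattern classifier: short bins of activity across all the electrodes go in, and probabilities over phonemes come out. The sketch below is illustrative only; the channel count, phoneme labels, classifier and data are stand-ins for the study’s actual neural-network decoder.

```python
# Minimal sketch: classify short bins of multi-electrode activity into
# phoneme probabilities. Channel count, phoneme labels, classifier choice
# and all data are illustrative stand-ins, not the study's actual model.
import numpy as np
from sklearn.linear_model import LogisticRegression

N_CHANNELS = 256                                 # e.g., four 8-by-8 arrays of 64 electrodes
PHONEMES = ["AH", "B", "EH", "K", "T", "SIL"]    # toy phoneme inventory

rng = np.random.default_rng(0)

# Synthetic training data: each phoneme gets its own mean firing-rate
# pattern across channels, and every short bin is a noisy sample of it.
templates = rng.uniform(1.0, 5.0, size=(len(PHONEMES), N_CHANNELS))
y_train = rng.integers(0, len(PHONEMES), size=5000)
X_train = rng.poisson(lam=templates[y_train]).astype(float)

decoder = LogisticRegression(max_iter=1000)
decoder.fit(X_train, y_train)

# At run time, each new bin of neural features yields a probability
# distribution over phonemes -- the decoder's "best guess" for that instant.
new_bin = rng.poisson(lam=templates[3]).astype(float).reshape(1, -1)
probs = decoder.predict_proba(new_bin)[0]
print({p: round(float(q), 3) for p, q in zip(PHONEMES, probs)})
```

The running sequence of those per-bin guesses is what gets handed to the language model.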
Stanford describes that language model as a “sophisticated autocorrect system,” which converts the streams of phonemes into the sequences of words they represent.
“This system is trained to know what words should come before other ones, and which phonemes make what words,” Willett explained. “If some phonemes were wrongly interpreted, it can still take a good guess.”
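In spirit, that is a noisy-channel decoder: each candidate word is scored both by how closely its pronunciation matches the decoded phonemes and by how plausible it is after the preceding word. The toy example below uses an invented four-word lexicon and made-up bigram probabilities, vastly smaller than the real system’s vocabulary and language model, purely to show how context can rescue a misread phoneme.

```python
# Toy "autocorrect" decoder: pick the word whose pronunciation best matches
# the decoded phonemes, weighted by a bigram prior on the previous word.
# The lexicon and bigram probabilities are invented for illustration only.
from math import log

LEXICON = {                      # word -> phoneme pronunciation
    "cat": ["K", "AE", "T"],
    "bat": ["B", "AE", "T"],
    "cap": ["K", "AE", "P"],
    "the": ["DH", "AH"],
}
BIGRAM = {                       # P(word | previous word), toy values
    ("the", "cat"): 0.5, ("the", "bat"): 0.3, ("the", "cap"): 0.2,
}

def edit_distance(a, b):
    # Levenshtein distance between two phoneme sequences.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,                           # deletion
                          d[i][j - 1] + 1,                           # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[len(a)][len(b)]

def best_word(decoded_phonemes, previous_word):
    def score(word):
        acoustic = -edit_distance(decoded_phonemes, LEXICON[word])  # fewer mismatches is better
        prior = log(BIGRAM.get((previous_word, word), 1e-3))        # plausibility after the previous word
        return acoustic + prior
    return max(LEXICON, key=score)

# The first phoneme was misread ("T" instead of "K"), but the remaining
# phonemes plus the word-level prior still recover the intended word.
print(best_word(["T", "AE", "T"], "the"))   # -> "cat"
```

Even with one phoneme decoded incorrectly, the rest of the sequence and the word-level statistics are usually enough to land on the intended word, which is the behavior Willett describes.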
Delivering speech through AI
Bennett engaged in about 25 training sessions to teach the algorithm how to recognize brain activity patterns associated with certain phonemes. Each session lasted about four hours as she attempted to repeat sentences chosen randomly from a large dataset of conversation samples. Bennett repeated 260 to 480 sentences per session, and the system continued to improve as it became more familiar with her brain activity.
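That steady improvement is, in essence, incremental recalibration: each new session adds recordings on which the decoder is retrained. The snippet below sketches the idea with synthetic data and an off-the-shelf incremental classifier; the session counts, feature shapes and model are illustrative assumptions, and the real pipeline retrains a far larger model on the accumulated recordings.

```python
# Sketch of session-by-session recalibration: after each training session,
# the decoder sees more of the participant's data, and its accuracy on a
# held-out set of bins generally improves. All data here is synthetic;
# session counts and feature shapes are illustrative only.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)
N_CHANNELS, N_PHONEMES = 256, 39
templates = rng.uniform(2.5, 3.5, size=(N_PHONEMES, N_CHANNELS))

def simulate_session(n_bins):
    labels = rng.integers(0, N_PHONEMES, size=n_bins)
    features = rng.poisson(lam=templates[labels]).astype(float)
    return features, labels

decoder = SGDClassifier()
X_test, y_test = simulate_session(2000)

for session in range(1, 26):                                 # ~25 training sessions
    X, y = simulate_session(int(rng.integers(2600, 4800)))   # bins recorded this session
    decoder.partial_fit(X, y, classes=np.arange(N_PHONEMES))
    print(f"session {session:2d}: held-out accuracy {decoder.score(X_test, y_test):.2f}")
```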
The BCI demonstrated an error rate of 9.1% when researchers restricted the model to a 50-word vocabulary. When they expanded the vocabulary to 125,000 words, the error rate rose to 23.8%. The researchers said that, while “far from perfect,” this represents a giant step beyond the previous state of the art.
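Error rates like these are conventionally measured as word error rate: the minimum number of word substitutions, insertions and deletions needed to turn the decoded sentence into the intended one, divided by the number of intended words. The sentences in the quick check below are made up; the computation is the same edit-distance dynamic program shown earlier, applied to words instead of phonemes.

```python
# Word error rate: edit distance between decoded and intended word
# sequences, divided by the number of intended words. Example sentences
# are invented for illustration.
def word_error_rate(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            d[i][j] = min(d[i - 1][j] + 1,                               # deletion
                          d[i][j - 1] + 1,                               # insertion
                          d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]))  # substitution
    return d[len(ref)][len(hyp)] / len(ref)

intended = "i would like a glass of water please"
decoded  = "i would like a class of water"
print(f"{word_error_rate(intended, decoded):.1%}")   # 2 errors / 8 words = 25.0%
```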
“Imagine,” Bennett says, “how different conducting everyday activities like shopping, attending appointments, ordering food, going into a bank, talking on a phone, expressing love or appreciation — even arguing — will be when nonverbal people can communicate their thoughts in real-time.”