Rutronik News

What are you thinking about, Honey?

“You can only read people’s faces and not their minds,” as the saying goes. That is generally a good thing. At Rutronik24, we don’t always want to know what colleagues think of us when we take the last piece of cake from the plate yet again. On the other hand, if brain signals could be converted into speech, it would benefit people who can no longer vocalize their thoughts because of illness. A team from Columbia University in New York City has moved another step closer to achieving this aim.

For a long time, neuroscientists have been attempting to find a direct interface to people’s thoughts by reading their brain signals. When we hear or speak words, this generates characteristic patterns of activity in the brain. Researchers have already used brain signals to successfully reconstruct letters that were read and even spoken. So far, however, it has not proved possible to convert these signals into comprehensible speech. Previous approaches have mostly used simple computer models to analyze acoustic spectrograms and assign them to the right brain signals.

The team from Columbia University wanted to increase the comprehensibility of the reconstructed speech by linking deep learning with innovations in speech synthesis. In their trial, the researchers combined a speech generation system, a so-called vocoder, with an artificial neural network that converts brain signals into speech. The test subjects were five epilepsy patients. The electrodes, which had been implanted in these patients for a different purpose, were able to pick up signals from the auditory cortex. “We discovered that people were able to understand and repeat 75 percent of the sounds, which is way beyond the level of any previous trial,” says Nima Mesgarani, who is leading the study at Columbia University.

To do this, the speech generation system was first trained, using the same technology as Amazon’s Echo or Apple’s Siri. However, the training data in the study did not consist of speech, but rather of the patients’ brain signals in reaction to words that were spoken to them. In this way, the system was meant to learn to identify the patterns that form in the brain when certain sounds are heard and to assign those brain signals to the corresponding speech sounds.

In order to test whether the speech generated in this way could be understood, the scientists got their AI to read the neural signals for the numbers one to nine and convert them into speech – once using the conventional spectrogram technology and once using the speech generation system, the vocoder. Another test was based on short sentences. The researchers played the sound files to eleven volunteers, who had to repeat the right numbers and evaluate the comprehensibility of the speech reconstruction. The result: The combination of the neural network and the vocoder achieved much better results than the other approaches tested; in three quarters of the cases, the listeners understood the right number. What’s more, four times out of five they could tell whether the speaker was a man or a woman.

Mesgarani hopes that the insights gained could one day mean that people who have no speech (or no longer have speech) only have to think “I need a glass of water” for the system to translate the thought into speech. Will it work? Niels Birbaumer, from the University of Tübingen, is skeptical: “Basically, the researchers have only recorded the brain’s reaction to an external stimulus,” which has been possible with other stimuli since electroencephalography (EEG) was first used in 1929. It is likely that several thousand electrodes in the brain would be needed to actually convert thoughts into speech.

The idea of people without the power of speech being able to communicate directly is a vision of the future that inspires optimism. On the other hand, the question arises as to whether actors with sinister intentions could also exploit the technology. It would certainly invalidate the notion that “thought is free,” as expressed in the traditional German folk song on freedom of thought, whose lyrics were first published on leaflets around 1780 and have been quoted ever since to voice a desire for freedom and independence in times of political oppression.

We are still a long way away from really being able to read people’s minds with the aid of artificial intelligence – so let’s hope that in the meantime the foundations are successfully laid for ensuring that this technology is not used indiscriminately. After all, here at Rutronik24, we don’t necessarily always want our partners to know what we are thinking about.