Researching the brain’s audio processing

Up to 256 sensors are placed on the brain’s outer surface

There has been quite a bit of coverage in the popular press about University of California, San Francisco researchers solving the mystery of how selective hearing works — how people can tune in to a single speaker while tuning out their crowded, noisy environs. But many of the interesting technical details were not covered.

To understand how selective hearing works in the brain, UCSF neurosurgeon Dr. Edward Chang, a faculty member in the UCSF Department of Neurological Surgery and the Keck Center for Integrative Neuroscience, and UCSF postdoctoral fellow Nima Mesgarani worked with three patients who were undergoing brain surgery for severe epilepsy. The UCSF epilepsy team pinpointed the parts of the brain responsible for the patients’ disabling seizures by mapping the brain’s activity with a thin sheet of up to 256 electrodes placed under the skull on the brain’s outer surface, or cortex. These electrodes recorded activity in the temporal lobe, home to the auditory cortex.

In the experiments, patients listened to two speech samples played to them simultaneously, in which different phrases were spoken by different speakers. They were asked to identify the words they heard spoken by one of the two speakers. The authors then applied new decoding methods to “reconstruct” what the subjects heard by analyzing their brain activity patterns. Strikingly, they found that neural responses in the auditory cortex reflected only the speech of the targeted speaker, and that their decoding algorithm could predict which speaker, and even which specific words, the subject was listening to based on those neural patterns. In other words, they could tell when the listener’s attention strayed to another speaker. “The algorithm worked so well that we could predict not only the correct responses, but also even when they paid attention to the wrong word,” Chang said.
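
The article does not spell out the decoding pipeline, but the general stimulus-reconstruction idea can be sketched: fit a regularized linear map from recorded neural activity back to an audio spectrogram, then ask which speaker’s spectrogram the reconstruction resembles more. The Python sketch below only illustrates that idea on synthetic data; the variable names, the ridge regression, the correlation comparison, and all dimensions are assumptions, not the authors’ method.

import numpy as np

# Illustrative stimulus-reconstruction decoder (not the authors' code).
# neural:  (n_samples, n_channels) band-limited cortical activity
# spec_a, spec_b: (n_samples, n_freq) spectrograms of the two competing talkers

def fit_linear_decoder(neural, spectrogram, alpha=1.0):
    # Ridge-regression map from neural activity to an audio spectrogram,
    # solved in closed form: W = (X'X + alpha*I)^-1 X'Y
    X = np.hstack([neural, np.ones((neural.shape[0], 1))])  # add bias column
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ spectrogram)

def reconstruct(neural, weights):
    X = np.hstack([neural, np.ones((neural.shape[0], 1))])
    return X @ weights

def attended_speaker(recon, spec_a, spec_b):
    # Pick the talker whose spectrogram correlates better with the reconstruction.
    corr_a = np.corrcoef(recon.ravel(), spec_a.ravel())[0, 1]
    corr_b = np.corrcoef(recon.ravel(), spec_b.ravel())[0, 1]
    return ("speaker A", corr_a) if corr_a > corr_b else ("speaker B", corr_b)

# Synthetic stand-in data, purely to make the sketch runnable.
rng = np.random.default_rng(0)
n_samples, n_channels, n_freq = 2000, 256, 32
spec_a = rng.random((n_samples, n_freq))
spec_b = rng.random((n_samples, n_freq))
mixing = rng.standard_normal((n_freq, n_channels))
neural = spec_a @ mixing + 0.1 * rng.standard_normal((n_samples, n_channels))  # "attends" to A
weights = fit_linear_decoder(neural[:1500], spec_a[:1500], alpha=10.0)
recon = reconstruct(neural[1500:], weights)
print(attended_speaker(recon, spec_a[1500:], spec_b[1500:]))

The comparison step is where the study’s finding matters: because the cortical responses track only the attended voice, the reconstruction should resemble that speaker’s spectrogram far more than the ignored one.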

UCSF neurosurgeon Dr. Edward Chang

The brain is not just processing sound, but performing very complex audio analysis. Revealing how our brains are wired to favor some auditory cues over others may inspire new approaches toward automating and improving how voice-activated electronic interfaces filter sounds in order to properly detect verbal commands.

To record cortical activity, the UCSF researchers used electrocorticography (ECoG). The measured signals are synchronized postsynaptic potentials (local field potentials) recorded directly from the exposed surface of the cortex. These potentials originate primarily in cortical pyramidal cells, so they must be conducted through several layers of the cerebral cortex, the pia mater, the cerebrospinal fluid (CSF), and the arachnoid mater before reaching subdural recording electrodes placed just below the dura mater (the outermost cranial membrane).
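
The article does not describe the signal processing applied to these local field potentials, but a common first step with ECoG recordings is to isolate a high-frequency band of the signal before any decoding. The sketch below assumes a 70–150 Hz band, a 400-Hz sampling rate, and a Butterworth filter — all illustrative choices, not details taken from the UCSF study — and shows one typical way to do this with SciPy.

import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_power(lfp, fs, band=(70.0, 150.0), order=4):
    # Band-pass a multichannel LFP array (channels x samples) and return the
    # analytic-amplitude envelope in the chosen band. The 70-150 Hz band and
    # 4th-order filter are assumptions for illustration only.
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, lfp, axis=-1)          # zero-phase band-pass
    envelope = np.abs(hilbert(filtered, axis=-1))    # instantaneous amplitude
    return envelope

# Example with synthetic data: 256 channels, 10 s at 400 Hz
fs = 400.0
lfp = np.random.randn(256, int(10 * fs))
hg = high_gamma_power(lfp, fs)
print(hg.shape)  # (256, 4000)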

How the brain can so effectively focus on a single voice is a problem of keen interest to the companies that make consumer technologies, because of the tremendous future market for all kinds of electronic devices with voice-activated interfaces. While the voice-recognition technologies behind interfaces such as Apple’s Siri have come a long way in the last few years, they are nowhere near as sophisticated as the human speech system.

An average person can walk into a noisy room and have a private conversation with relative ease — as if all the other voices in the room were muted. In fact, said Mesgarani, an engineer with a background in automatic speech recognition research, the engineering required to separate a single intelligible voice from a cacophony of speakers and background noise is a surprisingly difficult problem. Speech recognition, he said, is “something that humans are remarkably good at, but it turns out that machine emulation of this human ability is extremely difficult.”
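
One way to see the gap Mesgarani describes is that competing voices overlap in time and frequency, and a standard research baseline for separating them, the so-called ideal time-frequency mask, can only be computed if you already have the clean sources — exactly what a real device never gets. The sketch below illustrates this with two synthetic tones standing in for speech; the signals, the STFT settings, and the binary mask are all assumptions for illustration.

import numpy as np
from scipy.signal import stft, istft

# Two synthetic "voices" standing in for real speech; a machine only ever
# receives the mixture, which is what makes separation hard.
fs = 16000
t = np.arange(2 * fs) / fs
voice_a = np.sin(2 * np.pi * 220 * t) * (1 + 0.5 * np.sin(2 * np.pi * 3 * t))
voice_b = np.sin(2 * np.pi * 330 * t) * (1 + 0.5 * np.sin(2 * np.pi * 5 * t))
mixture = voice_a + voice_b

# An "ideal" binary mask keeps each time-frequency cell where voice A dominates,
# but computing it requires the clean sources, which a real system never has.
f, frames, spec_mix = stft(mixture, fs=fs)
_, _, spec_a = stft(voice_a, fs=fs)
_, _, spec_b = stft(voice_b, fs=fs)
mask = (np.abs(spec_a) > np.abs(spec_b)).astype(float)
_, recovered_a = istft(spec_mix * mask, fs=fs)
print(mixture.shape, recovered_a.shape)

The brain appears to solve the same selection problem from the mixture alone, which is precisely why the UCSF findings are of interest to engineers working on speech interfaces.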

Jim Harrison
