Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility

eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Hyojin Park ◽  
Christoph Kayser ◽  
Gregor Thut ◽  
Joachim Gross

During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and speaker’s lip movements and demonstrated significant entrainment in visual cortex. We then used partial coherence to remove contributions of the coherent auditory speech signal from the lip-brain coherence. Comparing this synchronization between different attention conditions revealed that attending visual speech enhances the coherence between activity in visual cortex and the speaker’s lips. Further, we identified a significant partial coherence between left motor cortex and lip movements and this partial coherence directly predicted comprehension accuracy. Our results emphasize the importance of visually entrained and attention-modulated rhythmic brain activity for the enhancement of audiovisual speech processing.
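The partial-coherence step described above (removing the coherent auditory speech signal from the lip-brain coherence) follows the standard three-signal partial coherence construction. A minimal sketch in Python, using the usual complex-coherency formula; function names, sampling rate, and window length are illustrative, not the authors' analysis code:

```python
import numpy as np
from scipy.signal import csd

def coherency(a, b, fs, nperseg):
    """Complex coherency between signals a and b."""
    f, Sab = csd(a, b, fs=fs, nperseg=nperseg)
    _, Saa = csd(a, a, fs=fs, nperseg=nperseg)
    _, Sbb = csd(b, b, fs=fs, nperseg=nperseg)
    return f, Sab / np.sqrt(Saa.real * Sbb.real)

def partial_coherence(x, y, z, fs=250.0, nperseg=512):
    """Coherence between x and y after removing the linear
    contribution of a third signal z (e.g. the auditory envelope)."""
    f, Rxy = coherency(x, y, fs, nperseg)
    _, Rxz = coherency(x, z, fs, nperseg)
    _, Rzy = coherency(z, y, fs, nperseg)
    num = Rxy - Rxz * Rzy
    den = np.sqrt((1 - np.abs(Rxz) ** 2) * (1 - np.abs(Rzy) ** 2))
    return f, np.abs(num / den) ** 2
```

If x and y are coherent only because both follow z, the partial coherence collapses toward zero while the plain coherence stays high, which is exactly the logic used to isolate the lip-specific contribution.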

2021 ◽  
Author(s):  
Mate Aller ◽  
Heidi Solberg Okland ◽  
Lucy J MacGregor ◽  
Helen Blank ◽  
Matthew H. Davis

Speech perception in noisy environments is enhanced by seeing facial movements of communication partners. However, the neural mechanisms by which auditory and visual speech are combined are not fully understood. We explored phase locking to auditory and visual signals in MEG recordings from 14 human participants (6 female) who reported words from single spoken sentences. We manipulated the acoustic clarity and visual speech signals such that critical speech information was present in auditory, visual, or both modalities. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase locking to the auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase locking to auditory signals in visual cortex was enhanced for AV speech compared to audio-only (AO) speech that was matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus (STG) did not show above-chance partial coherence with visual speech signals during AV conditions, but did show partial coherence in visual-only (VO) conditions. Hence, visual speech enabled stronger phase locking to auditory signals in visual areas, whereas phase locking to visual speech in auditory regions occurred only during silent lip-reading. Differences in these cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms during speech perception.
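The speech envelopes entering such a coherence analysis (auditory amplitude modulations restricted to the syllabic 2-6 Hz range) are commonly extracted via the Hilbert transform followed by a band-pass filter. A small illustrative sketch, not the authors' pipeline; the function name and filter order are assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def syllable_rate_envelope(audio, fs, band=(2.0, 6.0)):
    """Broadband amplitude envelope of a speech waveform,
    band-limited to the syllabic (2-6 Hz) range."""
    env = np.abs(hilbert(audio))  # instantaneous amplitude
    sos = butter(3, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, env)  # zero-phase filtering
```

The same band-limiting would be applied to a lip-aperture time series before computing coherence with the MEG signal, so that both envelopes live in the same syllable-rate band.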


2020 ◽  
Author(s):  
Jonathan E Peelle ◽  
Brent Spehar ◽  
Michael S Jones ◽  
Sarah McConkey ◽  
Joel Myerson ◽  
...  

In everyday conversation, we usually process the talker's face as well as the sound of their voice. Access to visual speech information is particularly useful when the auditory signal is degraded. Here we used fMRI to monitor brain activity while adults (n = 60) were presented with visual-only, auditory-only, and audiovisual words. As expected, audiovisual speech perception recruited both auditory and visual cortex, with a trend towards increased recruitment of premotor cortex in more difficult conditions (for example, in substantial background noise). We then investigated neural connectivity using psychophysiological interaction (PPI) analysis with seed regions in both primary auditory cortex and primary visual cortex. Connectivity between auditory and visual cortices was stronger in audiovisual conditions than in unimodal conditions, and extended to a wide network of regions in posterior temporal cortex and prefrontal cortex. Taken together, our results suggest a prominent role for cross-region synchronization in understanding both visual-only and audiovisual speech.
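A PPI analysis models the target region's signal as a function of the seed timeseries, the task regressor, and, critically, their product: the interaction coefficient captures how seed-target coupling changes with condition. A deliberately simplified sketch (it ignores the HRF convolution/deconvolution that a real fMRI PPI requires; all names and values are illustrative):

```python
import numpy as np

def ppi_regression(target, seed, task):
    """Fit target = b0 + b1*seed + b2*task + b3*(seed*task).
    b3, the PPI term, measures the task-dependent change in
    seed-target coupling (condition-modulated connectivity)."""
    seed = (seed - seed.mean()) / seed.std()  # standardize seed
    task = task - task.mean()                 # center condition regressor
    X = np.column_stack([np.ones_like(seed), seed, task, seed * task])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta
```

Centering the task regressor before forming the product keeps the interaction term from absorbing the main effect of the seed, which is the usual convention in PPI designs.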


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Markus Johannes Van Ackeren ◽  
Francesca M Barbero ◽  
Stefania Mattioni ◽  
Roberto Bottini ◽  
Olivier Collignon

The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives.




eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Bruno L Giordano ◽  
Robin A A Ince ◽  
Joachim Gross ◽  
Philippe G Schyns ◽  
Stefano Panzeri ◽  
...  

Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR, strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments.


2005 ◽  
Vol 17 (6) ◽  
pp. 939-953 ◽  
Author(s):  
Deborah A. Hall ◽  
Clayton Fussell ◽  
A. Quentin Summerfield

Listeners are able to extract important linguistic information by viewing the talker's face—a process known as “speechreading.” Previous studies of speechreading present small closed sets of simple words and their results indicate that visual speech processing engages a wide network of brain regions in the temporal, frontal, and parietal lobes that are likely to underlie multiple stages of the receptive language system. The present study further explored this network in a large group of subjects by presenting naturally spoken sentences which tap the richer complexities of visual speech processing. Four different baselines (blank screen, static face, nonlinguistic facial gurning, and auditory speech) enabled us to determine the hierarchy of neural processing involved in speechreading and to test the claim that visual input reliably accesses sound-based representations in the auditory cortex. In contrast to passively viewing a blank screen, the static-face condition evoked activation bilaterally across the border of the fusiform gyrus and cerebellum, and in the medial superior frontal gyrus and left precentral gyrus (p < .05, whole brain corrected). With the static face as baseline, the gurning face evoked bilateral activation in the motion-sensitive region of the occipital cortex, whereas visual speech additionally engaged the middle temporal gyrus, inferior and middle frontal gyri, and the inferior parietal lobe, particularly in the left hemisphere. These latter regions are implicated in lexical stages of spoken language processing. Although auditory speech generated extensive bilateral activation across both superior and middle temporal gyri, the group-averaged pattern of speechreading activation failed to include any auditory regions along the superior temporal gyrus, suggesting that fluent visual speech does not always involve sound-based coding of the visual input. 
An important finding from the individual subject analyses was that activation in the superior temporal gyrus did reach significance (p < .001, small-volume corrected) for a subset of the group. Moreover, the extent of the left-sided superior temporal gyrus activity was strongly correlated with speechreading performance. Skilled speechreading was also associated with activations and deactivations in other brain regions, suggesting that individual differences reflect the efficiency of a circuit linking sensory, perceptual, memory, cognitive, and linguistic processes rather than the operation of a single component process.


2018 ◽  
Author(s):  
Dmitriy Lisitsyn ◽  
Udo A. Ernst

Electrical stimulation is a promising tool for interacting with neuronal dynamics to identify neural mechanisms that underlie cognitive function. Since the effects of a single short stimulation pulse typically vary greatly and depend on the current network state, many experimental paradigms have instead resorted to continuous or periodic stimulation in order to establish and maintain a desired effect. However, such an approach explicitly leads to forced and ‘unnatural’ brain activity. Further, continuous stimulation can make it hard to parse the recorded activity and separate the neural signal from stimulation artifacts. In this study we propose an alternative strategy: by monitoring a system in real time, we exploit the existing preferred states, or attractors, of the network and apply short, precisely timed pulses in order to switch between those states. When the system is pushed into one of its attractors, its natural tendency to remain in that state prolongs the effect of a stimulation pulse, opening a larger window of opportunity to observe the consequences for cognitive processing. To elaborate on this idea, we consider flexible information routing in the visual cortex as a prototypical example. When processing a stimulus, neural populations in the visual cortex have been found to engage in synchronized gamma activity. In this context, selective signal routing is achieved by changing the relative phase between oscillatory activity in sending and receiving populations (communication through coherence, CTC). In order to explore how perturbations interact with CTC, we investigate a biophysically realistic network exhibiting similar synchronization and signal routing phenomena. We develop a closed-loop stimulation paradigm based on the phase-response characteristics of the network and demonstrate its ability to establish desired synchronization states.
By measuring information content throughout the model, we evaluate the effect of signal contamination caused by the stimulation in relation to the magnitude of the injected pulses and intrinsic noise in the system. Finally, we demonstrate that, up to a critical noise level, precisely timed perturbations can be used to artificially induce the effect of attention by selectively routing visual signals to higher cortical areas.
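The closed-loop idea, a pulse whose effect depends on the phase at which it arrives, used to pull the network toward a desired synchronization state, can be caricatured with a one-variable phase model. This toy sketch assumes an idealized sinusoidal phase-response curve and made-up parameters; it is not the biophysically realistic network used in the study:

```python
import numpy as np

def entrain_relative_phase(dphi0, target, detune=0.02, gain=0.3, steps=500):
    """Toy closed-loop control of the relative phase between two
    oscillators: each step the phase difference drifts by `detune`,
    and a phase-locked pulse corrects it through an idealized
    sinusoidal phase-response curve, pulling it toward `target`."""
    dphi = dphi0
    for _ in range(steps):
        # free drift plus the pulse's phase-dependent correction (toy PRC)
        dphi += detune + gain * np.sin(target - dphi)
        dphi = (dphi + np.pi) % (2 * np.pi) - np.pi  # wrap to (-pi, pi]
    return dphi
```

The stable fixed point sits where the corrective term cancels the drift, so the relative phase settles near the target (offset by a small detuning-dependent lag), illustrating why precisely timed pulses can hold a network in a chosen CTC configuration.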


2013 ◽  
Vol 126 (3) ◽  
pp. 350-356 ◽  
Author(s):  
Tim Paris ◽  
Jeesun Kim ◽  
Chris Davis

2020 ◽  
Author(s):  
Brian A. Metzger ◽  
John F. Magnotti ◽  
Zhengjia Wang ◽  
Elizabeth Nesbitt ◽  
Patrick J. Karas ◽  
...  

Experimentalists studying multisensory integration compare neural responses to multisensory stimuli with responses to the component modalities presented in isolation. This procedure is problematic for multisensory speech perception since audiovisual speech and auditory-only speech are easily intelligible but visual-only speech is not. To overcome this confound, we developed intracranial electroencephalography (iEEG) deconvolution. Individual stimuli always contained both auditory and visual speech, but jittering the onset asynchrony between modalities allowed the time course of the unisensory responses and the interaction between them to be independently estimated. We applied this procedure to electrodes implanted in human epilepsy patients (both male and female) over the posterior superior temporal gyrus (pSTG), a brain area known to be important for speech perception. iEEG deconvolution revealed sustained, positive responses to visual-only speech and larger, phasic responses to auditory-only speech. Confirming results from scalp EEG, responses to audiovisual speech were weaker than responses to auditory-only speech, demonstrating a subadditive multisensory neural computation. Leveraging the spatial resolution of iEEG, we extended these results to show that subadditivity is most pronounced in more posterior aspects of the pSTG. Across electrodes, subadditivity correlated with visual responsiveness, supporting a model in which visual speech enhances the efficiency of auditory speech processing in pSTG. The ability to separate neural processes may make iEEG deconvolution useful for studying a variety of complex cognitive and perceptual tasks. Significance statement: Understanding speech is one of the most important human abilities. Speech perception uses information from both the auditory and visual modalities.
It has been difficult to study neural responses to visual speech because visual-only speech is difficult or impossible to comprehend, unlike auditory-only and audiovisual speech. We used intracranial electroencephalography (iEEG) deconvolution to overcome this obstacle. We found that visual speech evokes a positive response in the human posterior superior temporal gyrus, enhancing the efficiency of auditory speech processing.
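The deconvolution logic, in which jittered audio-visual onset asynchrony makes overlapping unisensory responses separable by linear regression on lagged onset indicators, can be sketched as an FIR (finite impulse response) regression. A simplified illustration under stated assumptions (single channel, discrete samples, least-squares fit), not the authors' code:

```python
import numpy as np

def deconvolve(recording, onsets_a, onsets_v, n_lags):
    """Estimate separate auditory and visual response kernels from
    overlapping responses: regress the recording onto lagged onset
    indicators for each modality (FIR deconvolution)."""
    n = len(recording)
    X = np.zeros((n, 2 * n_lags))
    for col0, onsets in ((0, onsets_a), (n_lags, onsets_v)):
        for t0 in onsets:
            for lag in range(n_lags):
                if t0 + lag < n:
                    X[t0 + lag, col0 + lag] += 1.0
    beta, *_ = np.linalg.lstsq(X, recording, rcond=None)
    return beta[:n_lags], beta[n_lags:]  # auditory, visual kernels
```

Without jitter the auditory and visual columns of the design matrix would be collinear and the two kernels could not be separated; the onset asynchrony is what makes the regression identifiable.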

