Lip movements entrain the observers’ low-frequency brain oscillations to facilitate speech intelligibility

eLife ◽  
2016 ◽  
Vol 5 ◽  
Author(s):  
Hyojin Park ◽  
Christoph Kayser ◽  
Gregor Thut ◽  
Joachim Gross

During continuous speech, lip movements provide visual temporal signals that facilitate speech processing. Here, using MEG we directly investigated how these visual signals interact with rhythmic brain activity in participants listening to and seeing the speaker. First, we investigated coherence between oscillatory brain activity and speaker’s lip movements and demonstrated significant entrainment in visual cortex. We then used partial coherence to remove contributions of the coherent auditory speech signal from the lip-brain coherence. Comparing this synchronization between different attention conditions revealed that attending visual speech enhances the coherence between activity in visual cortex and the speaker’s lips. Further, we identified a significant partial coherence between left motor cortex and lip movements and this partial coherence directly predicted comprehension accuracy. Our results emphasize the importance of visually entrained and attention-modulated rhythmic brain activity for the enhancement of audiovisual speech processing.
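The partial-coherence step described above (removing the coherent auditory speech signal from the lip-brain coherence) follows the standard three-signal partial coherence construction. A minimal sketch in Python, using the usual complex-coherency formula; function names, sampling rate, and window length are illustrative, not the authors' analysis code:

```python
import numpy as np
from scipy.signal import csd

def coherency(a, b, fs, nperseg):
    """Complex coherency between signals a and b."""
    f, Sab = csd(a, b, fs=fs, nperseg=nperseg)
    _, Saa = csd(a, a, fs=fs, nperseg=nperseg)
    _, Sbb = csd(b, b, fs=fs, nperseg=nperseg)
    return f, Sab / np.sqrt(Saa.real * Sbb.real)

def partial_coherence(x, y, z, fs=250.0, nperseg=512):
    """Coherence between x and y after removing the linear
    contribution of a third signal z (e.g. the auditory envelope)."""
    f, Rxy = coherency(x, y, fs, nperseg)
    _, Rxz = coherency(x, z, fs, nperseg)
    _, Rzy = coherency(z, y, fs, nperseg)
    num = Rxy - Rxz * Rzy
    den = np.sqrt((1 - np.abs(Rxz) ** 2) * (1 - np.abs(Rzy) ** 2))
    return f, np.abs(num / den) ** 2
```

If x and y are coherent only because both follow z, the partial coherence collapses toward zero while the plain coherence stays high, which is exactly the logic used to isolate the lip-specific contribution.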

2021 ◽  
Author(s):  
Mate Aller ◽  
Heidi Solberg Okland ◽  
Lucy J MacGregor ◽  
Helen Blank ◽  
Matthew H. Davis

Speech perception in noisy environments is enhanced by seeing facial movements of communication partners. However, the neural mechanisms by which auditory and visual speech are combined are not fully understood. We explored phase locking to auditory and visual signals in MEG recordings from 14 human participants (6 female) who reported words from single spoken sentences. We manipulated the acoustic clarity and visual speech signals such that critical speech information was present in auditory, visual, or both modalities. MEG coherence analysis revealed that both auditory and visual speech envelopes (auditory amplitude modulations and lip aperture changes) were phase-locked to 2-6 Hz brain responses in auditory and visual cortex, consistent with entrainment to syllable-rate components. Partial coherence analysis was used to separate neural responses to correlated audio-visual signals and showed non-zero phase locking to the auditory envelope in occipital cortex during audio-visual (AV) speech. Furthermore, phase locking to auditory signals in visual cortex was enhanced for AV speech compared to audio-only (AO) speech that was matched for intelligibility. Conversely, auditory regions of the superior temporal gyrus (STG) did not show above-chance partial coherence with visual speech signals during AV conditions, but did show partial coherence in visual-only (VO) conditions. Hence, visual speech enabled stronger phase locking to auditory signals in visual areas, whereas phase locking to visual speech in auditory regions occurred only during silent lip-reading. Differences in these cross-modal interactions between auditory and visual speech signals are interpreted in line with cross-modal predictive mechanisms during speech perception.
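The speech envelopes entering such a coherence analysis (auditory amplitude modulations restricted to the syllabic 2-6 Hz range) are commonly extracted via the Hilbert transform followed by a band-pass filter. A small illustrative sketch, not the authors' pipeline; the function name and filter order are assumptions:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def syllable_rate_envelope(audio, fs, band=(2.0, 6.0)):
    """Broadband amplitude envelope of a speech waveform,
    band-limited to the syllabic (2-6 Hz) range."""
    env = np.abs(hilbert(audio))  # instantaneous amplitude
    sos = butter(3, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, env)  # zero-phase filtering
```

The same band-limiting would be applied to a lip-aperture time series before computing coherence with the MEG signal, so that both envelopes live in the same syllable-rate band.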


2020 ◽  
Author(s):  
Jonathan E Peelle ◽  
Brent Spehar ◽  
Michael S Jones ◽  
Sarah McConkey ◽  
Joel Myerson ◽  
...  

In everyday conversation, we usually process the talker's face as well as the sound of their voice. Access to visual speech information is particularly useful when the auditory signal is degraded. Here we used fMRI to monitor brain activity while adults (n = 60) were presented with visual-only, auditory-only, and audiovisual words. As expected, audiovisual speech perception recruited both auditory and visual cortex, with a trend towards increased recruitment of premotor cortex in more difficult conditions (for example, in substantial background noise). We then investigated neural connectivity using psychophysiological interaction (PPI) analysis with seed regions in both primary auditory cortex and primary visual cortex. Connectivity between auditory and visual cortices was stronger in audiovisual conditions than in unimodal conditions, and extended to a wide network of regions in posterior temporal cortex and prefrontal cortex. Taken together, our results suggest a prominent role for cross-region synchronization in understanding both visual-only and audiovisual speech.
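A PPI analysis models the target region's signal as a function of the seed timeseries, the task regressor, and, critically, their product: the interaction coefficient captures how seed-target coupling changes with condition. A deliberately simplified sketch (it ignores the HRF convolution/deconvolution that a real fMRI PPI requires; all names and values are illustrative):

```python
import numpy as np

def ppi_regression(target, seed, task):
    """Fit target = b0 + b1*seed + b2*task + b3*(seed*task).
    b3, the PPI term, measures the task-dependent change in
    seed-target coupling (condition-modulated connectivity)."""
    seed = (seed - seed.mean()) / seed.std()  # standardize seed
    task = task - task.mean()                 # center condition regressor
    X = np.column_stack([np.ones_like(seed), seed, task, seed * task])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta
```

Centering the task regressor before forming the product keeps the interaction term from absorbing the main effect of the seed, which is the usual convention in PPI designs.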


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Markus Johannes Van Ackeren ◽  
Francesca M Barbero ◽  
Stefania Mattioni ◽  
Roberto Bottini ◽  
Olivier Collignon

The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives.




eLife ◽  
2017 ◽  
Vol 6 ◽  
Author(s):  
Bruno L Giordano ◽  
Robin A A Ince ◽  
Joachim Gross ◽  
Philippe G Schyns ◽  
Stefano Panzeri ◽  
...  

Seeing a speaker’s face enhances speech intelligibility in adverse environments. We investigated the underlying network mechanisms by quantifying local speech representations and directed connectivity in MEG data obtained while human participants listened to speech of varying acoustic SNR and visual context. During high acoustic SNR, speech encoding by temporally entrained brain activity was strong in temporal and inferior frontal cortex, while during low SNR, strong entrainment emerged in premotor and superior frontal cortex. These changes in local encoding were accompanied by changes in directed connectivity along the ventral stream and the auditory-premotor axis. Importantly, the behavioral benefit arising from seeing the speaker’s face was not predicted by changes in local encoding but rather by enhanced functional connectivity between temporal and inferior frontal cortex. Our results demonstrate a role of auditory-frontal interactions in visual speech representations and suggest that functional connectivity along the ventral pathway facilitates speech comprehension in multisensory environments.


2005 ◽  
Vol 17 (6) ◽  
pp. 939-953 ◽  
Author(s):  
Deborah A. Hall ◽  
Clayton Fussell ◽  
A. Quentin Summerfield

Listeners are able to extract important linguistic information by viewing the talker's face—a process known as “speechreading.” Previous studies of speechreading present small closed sets of simple words and their results indicate that visual speech processing engages a wide network of brain regions in the temporal, frontal, and parietal lobes that are likely to underlie multiple stages of the receptive language system. The present study further explored this network in a large group of subjects by presenting naturally spoken sentences which tap the richer complexities of visual speech processing. Four different baselines (blank screen, static face, nonlinguistic facial gurning, and auditory speech) enabled us to determine the hierarchy of neural processing involved in speechreading and to test the claim that visual input reliably accesses sound-based representations in the auditory cortex. In contrast to passively viewing a blank screen, the static-face condition evoked activation bilaterally across the border of the fusiform gyrus and cerebellum, and in the medial superior frontal gyrus and left precentral gyrus (p < .05, whole brain corrected). With the static face as baseline, the gurning face evoked bilateral activation in the motion-sensitive region of the occipital cortex, whereas visual speech additionally engaged the middle temporal gyrus, inferior and middle frontal gyri, and the inferior parietal lobe, particularly in the left hemisphere. These latter regions are implicated in lexical stages of spoken language processing. Although auditory speech generated extensive bilateral activation across both superior and middle temporal gyri, the group-averaged pattern of speechreading activation failed to include any auditory regions along the superior temporal gyrus, suggesting that fluent visual speech does not always involve sound-based coding of the visual input. 
An important finding from the individual subject analyses was that activation in the superior temporal gyrus did reach significance (p < .001, small-volume corrected) for a subset of the group. Moreover, the extent of the left-sided superior temporal gyrus activity was strongly correlated with speechreading performance. Skilled speechreading was also associated with activations and deactivations in other brain regions, suggesting that individual differences reflect the efficiency of a circuit linking sensory, perceptual, memory, cognitive, and linguistic processes rather than the operation of a single component process.


2018 ◽  
Author(s):  
Dmitriy Lisitsyn ◽  
Udo A. Ernst

Electrical stimulation is a promising tool for interacting with neuronal dynamics to identify neural mechanisms that underlie cognitive function. Since the effects of a single short stimulation pulse typically vary greatly and depend on the current network state, many experimental paradigms have instead resorted to continuous or periodic stimulation in order to establish and maintain a desired effect. However, such an approach explicitly leads to forced and ‘unnatural’ brain activity. Further, continuous stimulation can make it hard to parse the recorded activity and separate the neural signal from stimulation artifacts. In this study we propose an alternative strategy: by monitoring a system in real time, we exploit the existing preferred states, or attractors, of the network and apply short, precisely timed pulses in order to switch between those states. When the system is pushed into one of its attractors, its natural tendency to remain in that state prolongs the effect of a stimulation pulse, opening a larger window of opportunity to observe the consequences for cognitive processing. To elaborate on this idea, we consider flexible information routing in the visual cortex as a prototypical example. When processing a stimulus, neural populations in the visual cortex have been found to engage in synchronized gamma activity. In this context, selective signal routing is achieved by changing the relative phase between oscillatory activity in sending and receiving populations (communication through coherence, CTC). In order to explore how perturbations interact with CTC, we investigate a biophysically realistic network exhibiting similar synchronization and signal routing phenomena. We develop a closed-loop stimulation paradigm based on the phase-response characteristics of the network and demonstrate its ability to establish desired synchronization states.
By measuring information content throughout the model, we evaluate the effect of signal contamination caused by the stimulation in relation to the magnitude of the injected pulses and intrinsic noise in the system. Finally, we demonstrate that, up to a critical noise level, precisely timed perturbations can be used to artificially induce the effect of attention by selectively routing visual signals to higher cortical areas.
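The closed-loop idea, a pulse whose effect depends on the phase at which it arrives, used to pull the network toward a desired synchronization state, can be caricatured with a one-variable phase model. This toy sketch assumes an idealized sinusoidal phase-response curve and made-up parameters; it is not the biophysically realistic network used in the study:

```python
import numpy as np

def entrain_relative_phase(dphi0, target, detune=0.02, gain=0.3, steps=500):
    """Toy closed-loop control of the relative phase between two
    oscillators: each step the phase difference drifts by `detune`,
    and a phase-locked pulse corrects it through an idealized
    sinusoidal phase-response curve, pulling it toward `target`."""
    dphi = dphi0
    for _ in range(steps):
        # free drift plus the pulse's phase-dependent correction (toy PRC)
        dphi += detune + gain * np.sin(target - dphi)
        dphi = (dphi + np.pi) % (2 * np.pi) - np.pi  # wrap to (-pi, pi]
    return dphi
```

The stable fixed point sits where the corrective term cancels the drift, so the relative phase settles near the target (offset by a small detuning-dependent lag), illustrating why precisely timed pulses can hold a network in a chosen CTC configuration.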


2013 ◽  
Vol 126 (3) ◽  
pp. 350-356 ◽  
Author(s):  
Tim Paris ◽  
Jeesun Kim ◽  
Chris Davis

2020 ◽  
Author(s):  
Brian A. Metzger ◽  
John F. Magnotti ◽  
Zhengjia Wang ◽  
Elizabeth Nesbitt ◽  
Patrick J. Karas ◽  
...  

Experimentalists studying multisensory integration compare neural responses to multisensory stimuli with responses to the component modalities presented in isolation. This procedure is problematic for multisensory speech perception since audiovisual speech and auditory-only speech are easily intelligible but visual-only speech is not. To overcome this confound, we developed intracranial electroencephalography (iEEG) deconvolution. Individual stimuli always contained both auditory and visual speech, but jittering the onset asynchrony between modalities allowed the time course of the unisensory responses and the interaction between them to be independently estimated. We applied this procedure to electrodes implanted in human epilepsy patients (both male and female) over the posterior superior temporal gyrus (pSTG), a brain area known to be important for speech perception. iEEG deconvolution revealed sustained, positive responses to visual-only speech and larger, phasic responses to auditory-only speech. Confirming results from scalp EEG, responses to audiovisual speech were weaker than responses to auditory-only speech, demonstrating a subadditive multisensory neural computation. Leveraging the spatial resolution of iEEG, we extended these results to show that subadditivity is most pronounced in more posterior aspects of the pSTG. Across electrodes, subadditivity correlated with visual responsiveness, supporting a model in which visual speech enhances the efficiency of auditory speech processing in pSTG. The ability to separate neural processes may make iEEG deconvolution useful for studying a variety of complex cognitive and perceptual tasks. Significance statement: Understanding speech is one of the most important human abilities. Speech perception uses information from both the auditory and visual modalities.
It has been difficult to study neural responses to visual speech because visual-only speech is difficult or impossible to comprehend, unlike auditory-only and audiovisual speech. We used intracranial electroencephalography (iEEG) deconvolution to overcome this obstacle. We found that visual speech evokes a positive response in the human posterior superior temporal gyrus, enhancing the efficiency of auditory speech processing.
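The deconvolution logic, in which jittered audio-visual onset asynchrony makes overlapping unisensory responses separable by linear regression on lagged onset indicators, can be sketched as an FIR (finite impulse response) regression. A simplified illustration under stated assumptions (single channel, discrete samples, least-squares fit), not the authors' code:

```python
import numpy as np

def deconvolve(recording, onsets_a, onsets_v, n_lags):
    """Estimate separate auditory and visual response kernels from
    overlapping responses: regress the recording onto lagged onset
    indicators for each modality (FIR deconvolution)."""
    n = len(recording)
    X = np.zeros((n, 2 * n_lags))
    for col0, onsets in ((0, onsets_a), (n_lags, onsets_v)):
        for t0 in onsets:
            for lag in range(n_lags):
                if t0 + lag < n:
                    X[t0 + lag, col0 + lag] += 1.0
    beta, *_ = np.linalg.lstsq(X, recording, rcond=None)
    return beta[:n_lags], beta[n_lags:]  # auditory, visual kernels
```

Without jitter the auditory and visual columns of the design matrix would be collinear and the two kernels could not be separated; the onset asynchrony is what makes the regression identifiable.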

