Acoustic correlates of manner of articulation for Urdu stop consonants

Sarmad Hussain

doi:10.1121/1.411172

Calibration of consonant perception to room reverberation

10.1101/2020.09.01.277590 ◽

2020 ◽

Author(s):

Eleni Vlahou ◽

Kanako Ueno ◽

Barbara G. Shinn-Cunningham ◽

Norbert Kopčo

Keyword(s):

Feature Analysis ◽

Stop Consonants ◽

Phonetic Feature ◽

Phonetic Categories ◽

Phonetic Features ◽

Manner Of Articulation ◽

Small Advantage

Purpose: We examined how consonant perception is affected by a preceding speech carrier simulated in the same or a different room, for a broad range of consonants. Carrier room, carrier length, and carrier length/target room uncertainty were manipulated. A phonetic feature analysis tested which phonetic categories are most influenced by the acoustic context of the carrier. Method: Two experiments were performed, each with 9 participants. Targets consisted of vowel-consonant (VC) syllables presented in one of 2 strongly reverberant rooms, preceded by a VC carrier presented either in the same room, a different reverberant room, or an anechoic room. In Experiment 1 the carrier length and the target room randomly varied from trial to trial, while in Experiment 2 they were fixed within blocks of trials. Results: Compared to the no-carrier condition, a consistent carrier provided only a small advantage for consonant perception, whereas inconsistent carriers disrupted performance significantly. For a different-room carrier, carrier length had an effect; performance dropped significantly in the 2-VC compared to the 4-VC carrier length. The only effect of carrier uncertainty was an overall drop in performance. Phonetic analysis showed that an inconsistent carrier significantly degraded identification of the manner of articulation, especially for stop consonants, and, in one of the rooms, also of voicing. Conclusions: Calibration of consonant perception to strong reverberation is exhibited through disruptions in perception when the room is switched. The strength of calibration varies across different consonants and phonetic features, as well as across rooms and durations of exposure to a given room.

Non-durational acoustic correlates of word-initial consonant gemination in Kelantan Malay: The potential roles of amplitude and f0

Journal of the International Phonetic Association ◽

10.1017/s0025100318000142 ◽

2018 ◽

Vol 50 (1) ◽

pp. 23-60

Author(s):

Mohd Hilmi Hamzah ◽

John Hajek ◽

Janet Fletcher

Keyword(s):

Native Speakers ◽

Initial Position ◽

Acoustic Parameters ◽

Rare Word ◽

Acoustic Correlates ◽

Closure Duration ◽

Disyllabic Words ◽

Consonant Gemination ◽

Manner Of Articulation ◽

Magnitude Difference

This study reports on non-durational acoustic correlates of typologically rare word-initial consonant gemination in Kelantan Malay (KM) by focusing on two acoustic parameters – amplitude and f0. Given the unusual characteristics of the word-initial consonant contrast and its potential maintenance in domain-initial environments, this study sets out to examine the extent to which amplitude and f0 can characterise such a contrast in KM in addition to the cross-linguistically established acoustic correlate of closure duration. The production data involved elicited materials from sixteen KM native speakers. RMS and f0 values were measured at the start of the vowel following stops and sonorants produced in isolation (i.e. utterance-initial position) and in a carrier sentence (i.e. utterance-medial position). Results indicate that the consonant contrast is reflected in systematic differences in (i) vowel onset amplitude and f0 following the target consonant and (ii) the ratios of amplitude and f0 across two syllables of disyllabic words. There are also effects of utterance position, manner of articulation and voicing type on the magnitude of contrast between singletons and geminates, with utterance-initial voiceless stops generally showing the greatest magnitude difference. The conclusion is drawn that the KM word-initial singleton/geminate consonant contrast can be associated with a set of acoustic parameters alongside closure duration.
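The RMS amplitude measure and the across-syllable amplitude ratio mentioned in this abstract can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not taken from the study, which does not describe its measurement scripts here.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a frame of audio samples."""
    if not samples:
        raise ValueError("frame must contain at least one sample")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def amplitude_ratio(first_syllable, second_syllable):
    """Ratio of RMS amplitudes across two syllables (first / second)."""
    return rms(first_syllable) / rms(second_syllable)


def level_difference_db(first_syllable, second_syllable):
    """The same ratio expressed as a level difference in decibels."""
    return 20.0 * math.log10(amplitude_ratio(first_syllable, second_syllable))


# Hypothetical vowel-onset frames, for illustration only.
syllable_1 = [2.0, -2.0, 2.0, -2.0]
syllable_2 = [1.0, -1.0, 1.0, -1.0]
ratio = amplitude_ratio(syllable_1, syllable_2)
difference_db = level_difference_db(syllable_1, syllable_2)
```

A ratio above 1 (a positive level difference) means the first frame is the louder of the two; which syllable a real analysis would place in the numerator is an assumption of this sketch.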

Dynamic spectral shape features as acoustic correlates for initial stop consonants

The Journal of the Acoustical Society of America ◽

10.1121/1.400735 ◽

1991 ◽

Vol 89 (6) ◽

pp. 2978-2991 ◽

Cited By ~ 35

Author(s):

Zaki B. Nossair ◽

Stephen A. Zahorian

Keyword(s):

Spectral Shape ◽

Stop Consonants ◽

Shape Features ◽

Acoustic Correlates

Calibration of Consonant Perception to Room Reverberation

Journal of Speech Language and Hearing Research ◽

10.1044/2021_jslhr-20-00396 ◽

2021 ◽

pp. 1-21

Author(s):

Eleni Vlahou ◽

Kanako Ueno ◽

Barbara G. Shinn-Cunningham ◽

Norbert Kopčo

Keyword(s):

Feature Analysis ◽

Stop Consonants ◽

Phonetic Feature ◽

Target Uncertainty ◽

Place Of Articulation ◽

Phonetic Categories ◽

Phonetic Features ◽

Manner Of Articulation ◽

Different Levels

Purpose: We examined how consonant perception is affected by a preceding speech carrier simulated in the same or a different room, for different classes of consonants. Carrier room, carrier length, and carrier length/target room uncertainty were manipulated. A phonetic feature analysis tested which phonetic categories are influenced by the manipulations in the acoustic context of the carrier. Method: Two experiments were performed, each with nine participants. Targets consisted of 10 or 16 vowel–consonant (VC) syllables presented in one of two strongly reverberant rooms, preceded by a multiple-VC carrier presented in either the same room, a different reverberant room, or an anechoic room. In Experiment 1, the carrier length and the target room randomly varied from trial to trial, whereas in Experiment 2, they were fixed within a block of trials. Results: Overall, a consistent carrier provided an advantage for consonant perception compared to inconsistent carriers, whether in anechoic or differently reverberant rooms. Phonetic analysis showed that carrier inconsistency significantly degraded identification of the manner of articulation, especially for stop consonants and, in one of the rooms, also of voicing. Carrier length and carrier/target uncertainty did not affect adaptation to reverberation for individual phonetic features. The detrimental effects of anechoic and different reverberant carriers on target perception were similar. Conclusions: The strength of calibration varies across different phonetic features, as well as across rooms with different levels of reverberation. Even though place of articulation is the feature that is affected by reverberation the most, it is the manner of articulation and, partially, voicing for which room adaptation is observed.
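As a rough illustration of what a phonetic feature analysis scores, the sketch below counts a response as correct for a given feature whenever the presented and reported consonants share that feature value, even if the consonant itself was misidentified. The feature table, the trial data, and the function name are hypothetical simplifications; the study's actual consonant set and scoring procedure are not given in the abstract.

```python
# Simplified, hypothetical feature table for a handful of consonants.
FEATURES = {
    "p": {"voicing": "voiceless", "manner": "stop", "place": "labial"},
    "b": {"voicing": "voiced", "manner": "stop", "place": "labial"},
    "t": {"voicing": "voiceless", "manner": "stop", "place": "alveolar"},
    "d": {"voicing": "voiced", "manner": "stop", "place": "alveolar"},
    "s": {"voicing": "voiceless", "manner": "fricative", "place": "alveolar"},
    "m": {"voicing": "voiced", "manner": "nasal", "place": "labial"},
}


def feature_accuracy(trials, feature):
    """Proportion of (presented, reported) pairs that agree on one feature."""
    if not trials:
        raise ValueError("need at least one trial")
    matches = sum(
        FEATURES[presented][feature] == FEATURES[reported][feature]
        for presented, reported in trials
    )
    return matches / len(trials)


# Hypothetical (presented, reported) consonant pairs.
trials = [("p", "b"), ("t", "s"), ("d", "d"), ("m", "b")]
scores = {f: feature_accuracy(trials, f) for f in ("voicing", "manner", "place")}
```

With these made-up trials, only one of four consonants is identified exactly, yet place is preserved on every trial while manner is preserved on half of them, which is the kind of per-feature contrast such an analysis is meant to expose.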

Single-cell activity in human STG during perception of phonemes is organized according to manner of articulation

10.1101/552315 ◽

2019 ◽

Author(s):

Yair Lakretz ◽

Ori Ossmy ◽

Naama Friedmann ◽

Roy Mukamel ◽

Itzhak Fried

Keyword(s):

Speech Perception ◽

Single Cell ◽

Cell Activity ◽

Superior Temporal Gyrus ◽

Acoustic Correlates ◽

Neurosurgical Patients ◽

Single Cell Responses ◽

Listening Task ◽

Manner Of Articulation ◽

Spiking Activity

A long-standing controversy persists in psycholinguistic research regarding the way phonemes are coded in human auditory cortex during speech perception. Whereas the motor theory of speech perception suggests that phonemes are organized in terms of common articulatory gestures that generate them, auditory theories argue that phonetic processing is organized based on common spectro-temporal patterns in phoneme waveforms. Here, we recorded spiking activity in the superior temporal gyrus (STG) from six neurosurgical patients who performed a listening task with phoneme stimuli. Using a Naïve-Bayes model, we show that single-cell responses to phonemes are governed by articulatory features that have acoustic correlates (manner-of-articulation) and organized according to sonority, with two main clusters for sonorants and obstruents. We further find that ‘neural similarity’ (i.e. the similarity of evoked spiking activity between pairs of phonemes) is comparable to the ‘perceptual similarity’ (i.e. how similar the pair of phonemes sound) based on perceptual confusion assessed behaviorally in healthy subjects. Thus, phonemes that were perceptually similar also had similar neural responses. Our findings establish that phonemes are encoded according to manner-of-articulation, supporting the auditory theories of perception, and that the perceptual representation of phonemes can be reflected by the activity of single neurons in STG.

Perceptual and Acoustic Correlates of Gender in the Prepubertal Voice

10.21437/interspeech.2017-1055 ◽

2017 ◽

Cited By ~ 1

Author(s):

Adrian P. Simpson ◽

Riccarda Funk ◽

Frederik Palmer

Keyword(s):

Acoustic Correlates

The acoustic correlates of stress and accent in English content and function words

10.21437/speechprosody.2016-89 ◽

2016 ◽

Cited By ~ 1

Author(s):

Robert Fuchs

Keyword(s):

Function Words ◽

Acoustic Correlates ◽

And Function

Phonetic feature size in second language acquisition: Examining VOT in voiceless and voiced stops

Second Language Research ◽

10.1177/02676583211008951 ◽

2021 ◽

pp. 026765832110089

Author(s):

Daniel J Olson

Keyword(s):

Second Language ◽

Second Language Acquisition ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Acoustic Similarity ◽

Stop Consonants ◽

Voiceless Stop ◽

English Speaking ◽

Underlying Mechanisms

Featural approaches to second language phonetic acquisition posit that the development of new phonetic norms relies on sub-phonemic features, expressed through a constellation of articulatory gestures and their corresponding acoustic cues, which may be shared across multiple phonemes. Within featural approaches, largely supported by research in speech perception, debate remains as to the fundamental scope or ‘size’ of featural units. The current study examines potential featural relationships between voiceless and voiced stop consonants, as expressed through the voice onset time cue. Native English-speaking learners of Spanish received targeted training on Spanish voiceless stop consonant production through a visual feedback paradigm. Analysis focused on the change in voice onset time, for both voiceless (i.e. trained) and voiced (i.e. non-trained) phonemes, across the pretest, posttest, and delayed posttest. The results demonstrated a significant improvement (i.e. reduction) in voice onset time for voiceless stops, which were subject to the training paradigm. In contrast, there was no significant change in the non-trained voiced stop consonants. These results suggest a limited featural relationship, with independent voice onset time (VOT) cues for voiceless and voiced phonemes. Possible underlying mechanisms that limit feature generalization in second language (L2) phonetic production, including gestural considerations and acoustic similarity, are discussed.

Temporal Encoding of the Voice Onset Time Phonetic Parameter by Field Potentials Recorded Directly From Human Auditory Cortex

Journal of Neurophysiology ◽

10.1152/jn.1999.82.5.2346 ◽

1999 ◽

Vol 82 (5) ◽

pp. 2346-2357 ◽

Cited By ~ 120

Author(s):

Mitchell Steinschneider ◽

Igor O. Volkov ◽

M. Daniel Noh ◽

P. Charles Garell ◽

Matthew A. Howard

Keyword(s):

Auditory Cortex ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Superior Temporal Gyrus ◽

Response Patterns ◽

Stop Consonants ◽

Field Potentials ◽

Heschl's Gyrus

Voice onset time (VOT) is an important parameter of speech that denotes the time interval between consonant onset and the onset of low-frequency periodicity generated by rhythmic vocal cord vibration. Voiced stop consonants (/b/, /g/, and /d/) in syllable initial position are characterized by short VOTs, whereas unvoiced stop consonants (/p/, /k/, and /t/) contain prolonged VOTs. As the VOT is increased in incremental steps, perception rapidly changes from a voiced stop consonant to an unvoiced consonant at an interval of 20–40 ms. This abrupt change in consonant identification is an example of categorical speech perception and is a central feature of phonetic discrimination. This study tested the hypothesis that VOT is represented within auditory cortex by transient responses time-locked to consonant and voicing onset. Auditory evoked potentials (AEPs) elicited by stop consonant-vowel (CV) syllables were recorded directly from Heschl's gyrus, the planum temporale, and the superior temporal gyrus in three patients undergoing evaluation for surgical remediation of medically intractable epilepsy. Voiced CV syllables elicited a triphasic sequence of field potentials within Heschl's gyrus. AEPs evoked by unvoiced CV syllables contained additional response components time-locked to voicing onset. Syllables with a VOT of 40, 60, or 80 ms evoked components time-locked to consonant release and voicing onset. In contrast, the syllable with a VOT of 20 ms evoked a markedly diminished response to voicing onset and elicited an AEP very similar in morphology to that evoked by the syllable with a 0-ms VOT. Similar response features were observed in the AEPs evoked by click trains. In this case, there was a marked decrease in amplitude of the transient response to the second click in trains with interpulse intervals of 20–25 ms.
Speech-evoked AEPs recorded from the posterior superior temporal gyrus lateral to Heschl's gyrus displayed comparable response features, whereas field potentials recorded from three locations in the planum temporale did not contain components time-locked to voicing onset. This study demonstrates that VOT at least partially is represented in primary and specific secondary auditory cortical fields by synchronized activity time-locked to consonant release and voicing onset. Furthermore, AEPs exhibit features that may facilitate categorical perception of stop consonants, and these response patterns appear to be based on temporal processing limitations within auditory cortex. Demonstrations of similar speech-evoked response patterns in animals support a role for these experimental models in clarifying selected features of speech encoding.
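The definition of VOT and the categorical shift described in this abstract can be sketched as a simple threshold rule. The 30 ms boundary used below is an assumption, chosen only because it falls inside the 20–40 ms interval the abstract reports; the function names are hypothetical, and real category boundaries vary with place of articulation, language, and listener.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: time from consonant release to the onset of voicing, in ms."""
    return voicing_onset_ms - release_ms


def classify_stop(vot_ms, boundary_ms=30.0):
    """Label a syllable-initial stop as voiced or unvoiced from its VOT.

    boundary_ms is an assumed value within the reported 20-40 ms range.
    """
    return "voiced" if vot_ms < boundary_ms else "unvoiced"


# The VOT steps named in the abstract: 0, 20, 40, 60, and 80 ms.
labels = {vot: classify_stop(vot) for vot in (0, 20, 40, 60, 80)}
```

Under this assumed boundary, the 0 ms and 20 ms syllables fall in one category and the 40, 60, and 80 ms syllables in the other, which parallels the grouping of evoked responses reported above.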
