The Effects of Fundamental Frequency Level on Voice Onset Time in Normal Adult Male Speakers

Christopher R. McCrea; Richard J. Morris

doi:10.1044/1092-4388(2005/069)

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2005/069) ◽

2005 ◽

Vol 48 (5) ◽

pp. 1013-1024 ◽

Cited By ~ 20

Author(s):

Christopher R. McCrea ◽

Richard J. Morris

Keyword(s):

Fundamental Frequency ◽

Adult Male ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Initial Position ◽

Normal Adult ◽

Frequency Level ◽

Main Effect ◽

Voiceless Stop

The purpose of this study was to examine the effect of fundamental frequency (F0) on stop consonant voice onset time (VOT). VOT was measured from the recordings of 56 young men reading phrases containing all 6 English voiced and voiceless stops in word-initial position across high-, medium-, and low-F0 levels. Separate analyses of variance for the voiced and voiceless stops revealed no significant main effect for F0 for the voiced stops but a significant F0 effect for the voiceless stops. Across the voiceless stops, productions at high F0s displayed significantly shorter VOTs than productions at low or mid F0s. The findings indicated that researchers must take into account the F0 level at which voiceless stop VOT is measured.

Analysis of Stutterers' Voice Onset Times and Fundamental Frequency Contours during Fluency

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.2702.219 ◽

1984 ◽

Vol 27 (2) ◽

pp. 219-225 ◽

Cited By ~ 24

Author(s):

E. Charles Healey ◽

Barbara Gutkin

Keyword(s):

Fundamental Frequency ◽

Adult Male ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Past Research ◽

Group Differences ◽

Future Research ◽

One Year

The purpose of this study was to examine stutterers' and nonstutterers' fluent voice onset time (VOT) and fundamental frequency (F0) contour measures from target syllables located at the beginning of a carrier phrase. Ten adult male stutterers were matched within one year of age with 10 adult male nonstutterers. Oscillographic and spectrographic analyses of subjects' VOT and F0 at vowel onset, average vowel F0, and speed and range of F0 change were obtained from fluent productions of 18 stop consonant-vowel syllables. Results showed that VOTs for voiced stops and the range of F0 change for voiceless stops were associated with significant between-group differences. All other dependent measures were not significantly different between the two groups. When compared with past research, these findings indicate that greater differences emerge between stutterers and nonstutterers when measures of fluency are taken at the beginning than in the middle of a carrier phrase. Implications for future research are discussed.

Voice onset time in Persian initial and intervocalic stop production

Journal of the International Phonetic Association ◽

10.1017/s0025100309990168 ◽

2009 ◽

Vol 39 (3) ◽

pp. 335-364 ◽

Cited By ~ 10

Author(s):

Mahmood Bijankhan ◽

Mandana Nourbakhsh

Keyword(s):

Sex Differences ◽

Fundamental Frequency ◽

Voice Onset Time ◽

Onset Time ◽

Initial Position ◽

Place Of Articulation ◽

Vowel Context ◽

Significant Difference ◽

Voiceless Stop ◽

High Vowels

The purpose of this study is to examine voice onset time as a phonetic correlate of voicing distinction in standard Persian. Issues pertinent to VOT are also addressed: namely, the effect of place of articulation, vowel context and sex of speakers. The VOTs were measured from recordings of five male and five female speakers reading 65 words that contained a full set of Persian oral stops in word initial and intervocalic positions. This acoustic experiment indicated that VOT distinguishes voiced from voiceless stops. The results also revealed that Persian uses mainly {voiceless unaspirated} and {voiceless aspirated} categories for [±voice] distinction in initial position and {voiced} and {voiceless aspirated} categories in intervocalic position. Vowel context also affected VOT values but the only significant difference was due to high vowels, which caused the preceding voiceless stop to have a longer VOT. Examining sex differences in the VOT values indicated that for voiced items females produced longer VOTs than males. However, voiceless items displayed no significant sex differences for VOT values. Fundamental frequency (F0) of the onset of the following vowel was also examined as another cue to voice distinction. Although the F0 values of voiceless tokens were higher than those of the voiced ones in each voiced–voiceless category, the results suggest that F0 is not a major cue distinguishing the two stop categories.

Phonetic feature size in second language acquisition: Examining VOT in voiceless and voiced stops

Second Language Research ◽

10.1177/02676583211008951 ◽

2021 ◽

pp. 026765832110089

Author(s):

Daniel J Olson

Keyword(s):

Second Language ◽

Second Language Acquisition ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Acoustic Similarity ◽

Stop Consonants ◽

Voiceless Stop ◽

English Speaking ◽

Underlying Mechanisms

Featural approaches to second language phonetic acquisition posit that the development of new phonetic norms relies on sub-phonemic features, expressed through a constellation of articulatory gestures and their corresponding acoustic cues, which may be shared across multiple phonemes. Within featural approaches, largely supported by research in speech perception, debate remains as to the fundamental scope or ‘size’ of featural units. The current study examines potential featural relationships between voiceless and voiced stop consonants, as expressed through the voice onset time (VOT) cue. Native English-speaking learners of Spanish received targeted training on Spanish voiceless stop consonant production through a visual feedback paradigm. Analysis focused on the change in VOT, for both voiceless (i.e. trained) and voiced (i.e. non-trained) phonemes, across the pretest, posttest, and delayed posttest. The results demonstrated a significant improvement (i.e. reduction) in VOT for voiceless stops, which were subject to the training paradigm. In contrast, there was no significant change in the non-trained voiced stop consonants. These results suggest a limited featural relationship, with independent VOT cues for voiceless and voiced phonemes. Possible underlying mechanisms that limit feature generalization in second language (L2) phonetic production, including gestural considerations and acoustic similarity, are discussed.

Stop consonant productions of Korean–English bilingual children

Bilingualism Language and Cognition ◽

10.1017/s1366728911000083 ◽

2011 ◽

Vol 15 (2) ◽

pp. 275-287 ◽

Cited By ~ 17

Author(s):

SUE ANN S. LEE ◽

GREGORY K. IVERSON

Keyword(s):

Fundamental Frequency ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Developmental Period ◽

English Speakers ◽

Bilingual Children ◽

Stop Consonants ◽

Speech Sounds ◽

English Bilingual

The purpose of this study was to conduct an acoustic examination of the obstruent stops produced by Korean–English bilingual children in connection with the question of whether bilinguals establish distinct categories of speech sounds across languages. Stop productions were obtained from ninety children in two age ranges, five and ten years: thirty Korean–English bilinguals, thirty monolingual Koreans and thirty monolingual English speakers. Voice onset time (VOT) lag at the word-initial stop and fundamental frequency (f0) in the following vowel (hereafter vowel-onset f0) were measured. The bilingual children showed different patterns of VOT in comparison to both English and Korean monolinguals, with longer VOT in their production of Korean stop consonants and shorter VOT for English. Moreover, the ten-year-old bilinguals distinguished all stop categories using both VOT and vowel-onset f0, whereas the five-year-olds tended to make stop distinctions based on VOT but not vowel-onset f0. The results of this study suggest that bilingual children at around five years of age do not yet have fully separate stop systems, and that the systems continue to evolve during the developmental period.

Psychophysical Boundary for Categorization of Voiced–Voiceless Stop Consonants in Native Japanese Speakers

Journal of Speech Language and Hearing Research ◽

10.1044/2017_jslhr-h-17-0131 ◽

2018 ◽

Vol 61 (3) ◽

pp. 789-796 ◽

Cited By ~ 2

Author(s):

Shunsuke Tamura ◽

Kazuhito Ito ◽

Nobuyuki Hirose ◽

Shuji Mori

Keyword(s):

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Stop Consonants ◽

Noise Detection ◽

Native Japanese Speakers ◽

Simultaneity Judgment ◽

Japanese Speakers ◽

Voiceless Stop ◽

Speech Identification

Purpose: The purpose of this study was to investigate the psychophysical boundary used for categorization of voiced–voiceless stop consonants in native Japanese speakers. Method: Twelve native Japanese speakers participated in the experiment. The stimuli were synthetic stop consonant–vowel stimuli varying in voice onset time (VOT) with manipulation of the amplitude of the initial noise portion and the first formant (F1) frequency of the periodic portion. There were 3 tasks, namely, speech identification to either /d/ or /t/, detection of the noise portion, and simultaneity judgment of onsets of the noise and periodic portions. Results: The VOT boundaries of /d/–/t/ were close to the shortest VOT values that allowed for detection of the noise portion but not to those for perceived nonsimultaneity of the noise and periodic portions. The slopes of noise detection functions along VOT were as sharp as those of voiced–voiceless identification functions. In addition, the effects of manipulating the amplitude of the noise portion and the F1 frequency of the periodic portion on the detection of the noise portion were similar to those on voiced–voiceless identification. Conclusion: The psychophysical boundary of perception of the initial noise portion masked by the following periodic portion may be used for voiced–voiceless categorization by Japanese speakers.

The production of the English stop voicing contrast by Arab L2 speakers of English

Indonesian Journal of Applied Linguistics ◽

10.17509/ijal.v10i2.28615 ◽

2020 ◽

Vol 10 (2) ◽

pp. 434-444

Author(s):

Mohd Hilmi Hamzah ◽

Ahmed Elsayed Samir Madbouly ◽

Hasliza Abdul Halim ◽

Abdul Halim Abdullah

Keyword(s):

Native Speakers ◽

Voice Onset Time ◽

Onset Time ◽

Initial Position ◽

Acoustic Parameter ◽

Voicing Contrast ◽

Phonological Contrast ◽

Fertile Ground ◽

Voiceless Stop ◽

Speech Learning

The English voiceless stop /p/ and voiced stop /ɡ/ are absent in the consonant inventory of Arabic. This difference provides a fertile ground for empirical research in L2 speech learning among Arab L2 speakers of English. The current study, therefore, aims to explore the English stop voicing contrast as produced by Arab native speakers. Focusing on Voice Onset Time (VOT) as an acoustic parameter, the study seeks to examine the extent to which (1) Arab L2 speakers of English maintain the English stop voicing contrast for /p-b/ and /k-ɡ/, and (2) the L2 VOT continuum by Arab L2 speakers follows or deviates from the L1 VOT continuum in English. The acoustic phonetic experiment involved elicited materials of /p-b/ and /k-ɡ/ from four male native speakers of Arabic. The tokens were recorded in isolation (utterance-initial position) and in a carrier sentence (utterance-medial position). The data were then acoustically analysed following standard segmentation, annotation and measurement criteria. Results reveal that the Arab L2 speakers can, to a large extent, maintain the English stop voicing contrast across all places of articulation, with voiced stops usually being produced with “normal” negative VOT (prevoicing) and voiceless stops usually being produced with “normal” positive VOT, accompanied by aspiration in the long-lag region. There are also exceptional cases of “abnormal” negative VOT (prevoicing) for voiceless stops and “abnormal” positive VOT (devoicing) for voiced stops, with a far larger number of devoiced tokens for voiced stops than prevoiced tokens for voiceless stops. The results accord well with the Speech Learning Model’s prediction that phonetically “new” sounds are relatively easier to learn than phonetically “similar” sounds. The conclusion is drawn that languages sharing the same sound contrast may exhibit different phonetic implementations in marking a phonological contrast.

Roles of Voice Onset Time and F0 in Stop Consonant Voicing Perception: Effects of Masking Noise and Low-Pass Filtering

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2012/12-0086) ◽

2013 ◽

Vol 56 (4) ◽

pp. 1097-1107 ◽

Cited By ~ 17

Author(s):

Matthew B. Winn ◽

Monita Chatterjee ◽

William J. Idsardi

Keyword(s):

Logistic Regression ◽

Fundamental Frequency ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Mixed Effects ◽

Stop Consonants ◽

Low Pass ◽

Masking Noise ◽

Phonetic Cues

Purpose: The contributions of voice onset time (VOT) and fundamental frequency (F0) were evaluated for the perception of voicing in syllable-initial stop consonants in words that were low-pass filtered and/or masked by speech-shaped noise. It was expected that listeners would rely less on VOT and more on F0 in these degraded conditions. Method: Twenty young listeners with normal hearing identified modified natural speech tokens that varied by VOT and F0 in several conditions of low-pass filtering and masking noise. Stimuli included /b/–/p/ and /d/–/t/ continua that were presented in separate blocks. Identification results were modeled using mixed-effects logistic regression. Results: When speech was filtered and/or masked by noise, listeners' voicing perceptions were driven less by VOT and more by F0. Speech-shaped masking noise exerted greater effects on the /b/–/p/ contrast, while low-pass filtering exerted greater effects on the /d/–/t/ contrast, consistent with the acoustics of these contrasts. Conclusion: Listeners can adjust their use of acoustic-phonetic cues in a dynamic way that is appropriate for challenging listening conditions; cues that are less influential in ideal conditions can gain priority in challenging conditions.
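
As a rough illustration of the cue-weighting idea in this abstract, the sketch below uses a plain (fixed-effects only) logistic function, not the mixed-effects model the authors fitted. The function name, the coefficient values, and the two condition labels are hypothetical and chosen only to show how a smaller VOT weight and a larger F0 weight change the predicted response; they are not estimates from the study.

```python
import math

def p_voiceless(vot_ms, f0_hz, b0, b_vot, b_f0):
    """Probability of a 'voiceless' response under a plain logistic model."""
    z = b0 + b_vot * vot_ms + b_f0 * f0_hz
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (not from the paper): intercept, VOT weight, F0 weight.
CLEAR = {"b0": -12.0, "b_vot": 0.30, "b_f0": 0.02}
DEGRADED = {"b0": -12.0, "b_vot": 0.05, "b_f0": 0.09}

# Same short-lag VOT, two different F0 values at vowel onset.
for name, coef in (("clear", CLEAR), ("degraded", DEGRADED)):
    low = p_voiceless(vot_ms=20, f0_hz=100, **coef)
    high = p_voiceless(vot_ms=20, f0_hz=140, **coef)
    print(f"{name}: F0 shift changes P(voiceless) by {high - low:.3f}")
```

With these assumed weights, the same 40 Hz change in F0 moves the predicted response much more in the "degraded" setting than in the "clear" one, which is the qualitative pattern the abstract describes.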

On the Acquisition of English Voiceless Stop VOT by Indonesian-English Bilinguals: Evidence of Input Frequency

k@ta ◽

10.9744/kata.20.2.45-52 ◽

2019 ◽

Vol 20 (2) ◽

pp. 45-52

Author(s):

Evynurul Laily Zen

Keyword(s):

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Close Relation ◽

Input Frequency ◽

Contributing Factors ◽

Interactive Communication ◽

Her Family ◽

Voiceless Stop ◽

The One

The paper investigated the acquisition of Voice Onset Time (VOT) of the English voiceless stop consonants /p/, /t/, and /k/ by Indonesian-English bilingual children, in close relation to how second language (L2) input shapes L2 VOT production. It looked at two types of bilingual participants: (1) one 6-year-old participant who had received extensive input from native English speakers on YouTube, about 8 hours per day since she was two, in addition to interactive communication in English with her family members; and (2) four students (aged 7-8 years) of an International Class Program with a non-native English environment. Both groups were residing in Malang, East Java, Indonesia at the time of data collection. The comparative analysis concluded that the VOT values differ significantly across different inputs. The participants with non-native input acquired much shorter VOTs, with averages falling within 28–36 ms, while the one with native input achieved native-like VOTs averaging 69 ms for /p/ and /t/ and even longer for /k/. Individual differences might arise from input frequency levels, types of input, and the complexities of the phonological properties of Indonesian and English.

Temporal Encoding of the Voice Onset Time Phonetic Parameter by Field Potentials Recorded Directly From Human Auditory Cortex

Journal of Neurophysiology ◽

10.1152/jn.1999.82.5.2346 ◽

1999 ◽

Vol 82 (5) ◽

pp. 2346-2357 ◽

Cited By ~ 120

Author(s):

Mitchell Steinschneider ◽

Igor O. Volkov ◽

M. Daniel Noh ◽

P. Charles Garell ◽

Matthew A. Howard

Keyword(s):

Auditory Cortex ◽

Voice Onset Time ◽

Onset Time ◽

Stop Consonant ◽

Superior Temporal Gyrus ◽

Response Patterns ◽

Stop Consonants ◽

Field Potentials ◽

Heschl's Gyrus

Voice onset time (VOT) is an important parameter of speech that denotes the time interval between consonant onset and the onset of low-frequency periodicity generated by rhythmic vocal cord vibration. Voiced stop consonants (/b/, /g/, and /d/) in syllable initial position are characterized by short VOTs, whereas unvoiced stop consonants (/p/, /k/, and /t/) contain prolonged VOTs. As the VOT is increased in incremental steps, perception rapidly changes from a voiced stop consonant to an unvoiced consonant at an interval of 20–40 ms. This abrupt change in consonant identification is an example of categorical speech perception and is a central feature of phonetic discrimination. This study tested the hypothesis that VOT is represented within auditory cortex by transient responses time-locked to consonant and voicing onset. Auditory evoked potentials (AEPs) elicited by stop consonant-vowel (CV) syllables were recorded directly from Heschl's gyrus, the planum temporale, and the superior temporal gyrus in three patients undergoing evaluation for surgical remediation of medically intractable epilepsy. Voiced CV syllables elicited a triphasic sequence of field potentials within Heschl's gyrus. AEPs evoked by unvoiced CV syllables contained additional response components time-locked to voicing onset. Syllables with a VOT of 40, 60, or 80 ms evoked components time-locked to consonant release and voicing onset. In contrast, the syllable with a VOT of 20 ms evoked a markedly diminished response to voicing onset and elicited an AEP very similar in morphology to that evoked by the syllable with a 0-ms VOT. Similar response features were observed in the AEPs evoked by click trains. In this case, there was a marked decrease in amplitude of the transient response to the second click in trains with interpulse intervals of 20–25 ms. Speech-evoked AEPs recorded from the posterior superior temporal gyrus lateral to Heschl's gyrus displayed comparable response features, whereas field potentials recorded from three locations in the planum temporale did not contain components time-locked to voicing onset. This study demonstrates that VOT is at least partially represented in primary and specific secondary auditory cortical fields by synchronized activity time-locked to consonant release and voicing onset. Furthermore, AEPs exhibit features that may facilitate categorical perception of stop consonants, and these response patterns appear to be based on temporal processing limitations within auditory cortex. Demonstrations of similar speech-evoked response patterns in animals support a role for these experimental models in clarifying selected features of speech encoding.
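
The definition of VOT given in this abstract, and the abrupt perceptual switch it describes, can be sketched in a few lines. This is a minimal illustration only: the function names, the event times, and the single 30 ms boundary are assumptions (the abstract reports only that the switch occurs somewhere in the 20–40 ms range), and real listeners' identification functions are graded, not a hard threshold.

```python
def vot_ms(release_s, voicing_onset_s):
    """Voice onset time in ms: voicing onset minus consonant release."""
    return (voicing_onset_s - release_s) * 1000.0

def label_stop(vot, boundary_ms=30.0):
    """Label a token 'voiced' or 'unvoiced' with a single hard boundary."""
    # boundary_ms is an assumed value inside the 20-40 ms range cited above.
    return "unvoiced" if vot >= boundary_ms else "voiced"

# Hypothetical event times in seconds (release, voicing onset).
tokens = [(0.100, 0.100), (0.100, 0.120), (0.100, 0.140), (0.100, 0.180)]
for release, voicing in tokens:
    v = vot_ms(release, voicing)
    print(f"VOT = {v:5.1f} ms -> {label_stop(v)}")
```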

Acoustic Integrity of Speech Production in Children With Moderate and Severe Hearing Impairment

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3501.88 ◽

1992 ◽

Vol 35 (1) ◽

pp. 88-95 ◽

Cited By ~ 23

Author(s):

John Ryalls ◽

Annie Larouche

Keyword(s):

Hearing Impairment ◽

Speech Production ◽

Fundamental Frequency ◽

Voice Onset Time ◽

Total Duration ◽

Onset Time ◽

Hearing Impaired ◽

Formant Frequencies ◽

Standard Deviations

Ten normally hearing and 10 age-matched subjects with moderate-to-severe hearing impairment were recorded producing a protocol of 18 basic syllables [/pi/, /pa/, /pu/; /bi/, /ba/, /bu/; /ti/, /ta/, /tu/; /di/, /da/, /du/; /ki/, /ka/, /ku/; /gi/, /ga/, /gu/] repeated five times. The resulting 90 syllables were digitized and measured for (a) total duration; (b) voice-onset time (VOT) of the initial consonant; (c) fundamental frequency (F0) at midpoint of vowel; and (d) formant frequencies (F1, F2, F3), also measured at midpoint of vowel. Statistical comparisons were conducted on (a) average values for each syllable, and (b) standard deviations. Although there were numerical differences between normally hearing and hearing-impaired groups, few differences were statistically significant.
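
The two summary statistics named in this abstract (a per-syllable average and a per-syllable standard deviation over the five repetitions) could be computed along the following lines. This is a sketch under stated assumptions: the `summarize` helper and the VOT values are hypothetical, and the abstract does not say whether the authors used the sample or population standard deviation; the sample form is assumed here.

```python
from statistics import mean, stdev

def summarize(repetitions):
    """Map each syllable to (mean, sample SD) over its repeated measurements."""
    return {syl: (mean(vals), stdev(vals)) for syl, vals in repetitions.items()}

# Hypothetical VOT values in ms, five repetitions per syllable (not study data).
vot_by_syllable = {
    "pa": [62.0, 58.0, 65.0, 60.0, 55.0],
    "ba": [12.0, 9.0, 15.0, 11.0, 13.0],
}

for syl, (avg, sd) in summarize(vot_by_syllable).items():
    print(f"/{syl}/: mean = {avg:.1f} ms, SD = {sd:.1f} ms")
```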
