Speaker‐independent vowel classification based on fundamental frequency and formant frequencies

1987 ◽  
Vol 81 (S1) ◽  
pp. S93-S93
Author(s):  
James Hillenbrand ◽  
Robert T. Gayvert


ALQALAM ◽  
2015 ◽  
Vol 32 (2) ◽  
pp. 284
Author(s):  
Muhammad Subali ◽  
Miftah Andriansyah ◽  
Christanto Sinambela

This article examines the similarities and differences in fundamental frequency and formant frequencies, estimated with the autocorrelation function and the LPC function in a MATLAB 2012b GUI, for the sounds of hijaiyah letters produced by adult male beginner and expert speakers of makhraj pronunciation. The two speakers' productions are then compared by computing a matching distance over the cepstrum using the DTW method. The beginner-level makhraj speech was recorded with a MATLAB algorithm in the GUI from 22-year-old students of Universitas Gunadarma and SITC. The expert-level makhraj speech was taken from previous research; those speakers were 20-30 years old at the time the data were collected. Each recording was analyzed to extract the fundamental frequency and formant frequencies, the values were compared between beginner and expert speech, and the matching distance between the two was computed. The results show that, for every letter, beginner and expert productions differ in both fundamental frequency and formant frequencies. The DTW matching distances between beginner and expert speech ranged from 28.9746 to 136.4.
Keywords: fundamental frequency, formant frequency, hijaiyah letters, makhraj
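The F0-estimation step named above, picking the strongest autocorrelation peak within a plausible pitch-period range, can be sketched as follows (a minimal illustration, not the authors' MATLAB GUI; the sample rate and test signal are invented):

```python
import numpy as np

def estimate_f0_autocorr(signal, sr, fmin=60.0, fmax=400.0):
    """Estimate fundamental frequency from the peak of the autocorrelation
    function within an admissible pitch-period range."""
    sig = signal - np.mean(signal)
    ac = np.correlate(sig, sig, mode="full")[len(sig) - 1:]  # lags >= 0
    lo = int(sr / fmax)  # shortest period to consider
    hi = int(sr / fmin)  # longest period to consider
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

# Synthetic vowel-like frame: 150 Hz fundamental plus two harmonics
sr = 16000
t = np.arange(0, 0.05, 1 / sr)
x = (np.sin(2 * np.pi * 150 * t)
     + 0.5 * np.sin(2 * np.pi * 300 * t)
     + 0.25 * np.sin(2 * np.pi * 450 * t))
print(estimate_f0_autocorr(x, sr))  # close to 150 Hz
```

The search is restricted to lags between sr/fmax and sr/fmin so that harmonics and very long lags cannot masquerade as the pitch period.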


Animals ◽  
2018 ◽  
Vol 8 (10) ◽  
pp. 167 ◽  
Author(s):  
Anton Baotic ◽  
Maxime Garcia ◽  
Markus Boeckle ◽  
Angela Stoeger

African savanna elephants live in dynamic fission–fusion societies and exhibit a sophisticated vocal communication system. Their most frequent call-type is the ‘rumble’, with a fundamental frequency (which refers to the lowest vocal fold vibration rate when producing a vocalization) near or in the infrasonic range. Rumbles are used in a wide variety of behavioral contexts, for short- and long-distance communication, and convey contextual and physical information. For example, maturity (age and size) is encoded in male rumbles by formant frequencies (the resonance frequencies of the vocal tract), having the most informative power. As sound propagates, however, its spectral and temporal structures degrade progressively. Our study used manipulated and resynthesized male social rumbles to simulate large and small individuals (based on different formant values) to quantify whether this phenotypic information efficiently transmits over long distances. To examine transmission efficiency and the potential influences of ecological factors, we broadcasted and re-recorded rumbles at distances of up to 1.5 km in two different habitats at the Addo Elephant National Park, South Africa. Our results show that rumbles were affected by spectral–temporal degradation over distance. Interestingly and unlike previous findings, the transmission of formants was better than that of the fundamental frequency. Our findings demonstrate the importance of formant frequencies for the efficiency of rumble propagation and the transmission of information content in a savanna elephant’s natural habitat.


Author(s):  
Yeptain Leung ◽  
Jennifer Oates ◽  
Siew-Pang Chan ◽  
Viktória Papp

Purpose: The aim of the study was to examine associations between speaking fundamental frequency (fos), vowel formant frequencies (F), listener perceptions of speaker gender, and vocal femininity–masculinity. Method: An exploratory study was undertaken to examine associations between fos, F1–F3, listener perceptions of speaker gender (nominal scale), and vocal femininity–masculinity (visual analog scale). For 379 speakers of Australian English aged 18–60 years, fos mode and F1–F3 (12 monophthongs; 36 Fs in total) were analyzed on a standard reading passage. Seventeen listeners rated speaker gender and vocal femininity–masculinity on randomized audio recordings of these speakers. Results: Model building using principal component analysis suggested the 36 Fs could be succinctly reduced to seven principal components (PCs). Generalized structural equation modeling (with the seven PCs of F and fos as predictors) suggested that only F2 and fos predicted listener perceptions of speaker gender (male, female, unable to decide). However, listener perceptions of vocal femininity–masculinity behaved differently and were predicted by F1, F3, and the contrast between monophthongs at the extremities of the F1 acoustic vowel space, in addition to F2 and fos. Furthermore, listeners' perceptions of speaker gender also substantially influenced ratings of vocal femininity–masculinity. Conclusion: Adjusted odds ratios highlighted the substantially larger contribution of F, relative to fos, to listener perceptions of speaker gender and vocal femininity–masculinity than has previously been reported.
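The dimensionality-reduction step, collapsing 36 formant measures per speaker into seven principal components, can be sketched with a plain SVD-based PCA. The data below are a synthetic stand-in (379 speakers, 36 measures, driven by 7 latent factors), not the study's measurements:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: 379 speakers x 36 formant measures
# (F1-F3 for 12 monophthongs), generated from 7 latent factors
latent = rng.normal(size=(379, 7))
loadings = rng.normal(size=(7, 36))
formants = latent @ loadings + 0.1 * rng.normal(size=(379, 36))

# PCA via SVD of the mean-centered data matrix
X = formants - formants.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
var_explained = s**2 / np.sum(s**2)

# Each speaker summarized by scores on the first seven PCs
scores = X @ Vt[:7].T
print(scores.shape, var_explained[:7].sum())
```

Because the synthetic data have seven underlying factors, the first seven components recover nearly all of the variance; on real formant data the cutoff would be chosen from the scree of `var_explained`.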


1992 ◽  
Vol 35 (1) ◽  
pp. 88-95 ◽  
Author(s):  
John Ryalls ◽  
Annie Larouche

Ten normally hearing and 10 age-matched subjects with moderate-to-severe hearing impairment were recorded producing a protocol of 18 basic syllables [/pi/, /pa/, /pu/; /bi/, /ba/, /bu/; /ti/, /ta/, /tu/; /di/, /da/, /du/; /ki/, /ka/, /ku/; /gi/, /ga/, /gu/], each repeated five times. The resulting 90 syllables were digitized and measured for (a) total duration; (b) voice-onset time (VOT) of the initial consonant; (c) fundamental frequency (F0) at the midpoint of the vowel; and (d) formant frequencies (F1, F2, F3), also measured at the midpoint of the vowel. Statistical comparisons were conducted on (a) average values for each syllable and (b) standard deviations. Although there were numerical differences between the normally hearing and hearing-impaired groups, few differences were statistically significant.


1998 ◽  
Vol 87 (2) ◽  
pp. 595-600 ◽  
Author(s):  
S. P. Whiteside

This experiment assessed whether fundamental frequency or formant frequencies have more perceptual salience in the identification of the sex of the speaker from synthesized vowels. Four sets of ten vowels were synthesized by combining fundamental frequencies and formant frequencies in different permutations, and 50 listeners took part in a listening test. Analysis of the listening test scores suggested that for 36 of the vowels, the fundamental frequency (F0) was probably the most salient perceptual cue. For the remaining four vowels, however, this was not the case: either the formant frequencies or the onset–offset patterns of F0 appeared to carry some perceptual salience.
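Vowels of this kind can be built by combining an F0-driven source with formant resonators. Below is a minimal source-filter sketch (not the synthesizer used in the study; the sample rate, formant values, and bandwidths are illustrative):

```python
import numpy as np

def resonator_coeffs(freq, bw, sr):
    """Second-order IIR resonator (digital formant filter):
    y[n] = x[n] + b1*y[n-1] + b2*y[n-2]."""
    r = np.exp(-np.pi * bw / sr)
    theta = 2 * np.pi * freq / sr
    return 2 * r * np.cos(theta), -r * r

def synthesize_vowel(f0, formants, bandwidths, sr=16000, dur=0.3):
    """Impulse-train source at f0 passed through cascaded formant resonators."""
    n = int(sr * dur)
    period = int(round(sr / f0))
    source = np.zeros(n)
    source[::period] = 1.0  # crude glottal-pulse stand-in
    y = source
    for freq, bw in zip(formants, bandwidths):
        b1, b2 = resonator_coeffs(freq, bw, sr)
        out = np.zeros(n)
        for i in range(n):
            out[i] = y[i]
            if i >= 1:
                out[i] += b1 * out[i - 1]
            if i >= 2:
                out[i] += b2 * out[i - 2]
        y = out
    return y / np.max(np.abs(y))

# /a/-like vowel: male-range f0 with textbook-style formant values
vowel = synthesize_vowel(f0=120, formants=[730, 1090, 2440],
                         bandwidths=[80, 90, 120])
```

Holding the formant frequencies fixed while changing `f0` (or vice versa) yields exactly the kind of mismatched permutations this experiment presented to listeners.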


2020 ◽  
Vol 63 (6) ◽  
pp. 1658-1674
Author(s):  
Lucie Ménard ◽  
Amélie Prémont ◽  
Pamela Trudeau-Fisette ◽  
Christine Turgeon ◽  
Mark Tiede

Objective: We aimed to investigate the production of contrastive emphasis in French-speaking 4-year-olds and adults. Based on previous work, we predicted that, due to their immature motor control abilities, preschool-aged children would produce smaller articulatory differences between emphasized and neutral syllables than adults. Method: Ten 4-year-old children and 10 adult French speakers were recorded while repeating /bib/, /bub/, and /bab/ sequences in neutral and contrastive emphasis conditions. Synchronous recordings of tongue movements, lip and jaw positions, and speech signals were made. Lip positions and tongue shapes were analyzed; formant frequencies, amplitude, fundamental frequency, and duration were extracted from the acoustic signals; and between-vowel contrasts were calculated. Results: Emphasized vowels were higher in pitch, intensity, and duration than their neutral counterparts in all participants. However, the effect of contrastive emphasis on lip position was smaller in children, and prosody did not affect tongue position in children, whereas it did in adults. As a result, children's productions were perceived less accurately than those of adults. Conclusion: These findings suggest that 4-year-old children have not yet learned to produce hyperarticulated forms of phonemic goals that would allow them to successfully contrast syllables and enhance prosodic saliency.


2017 ◽  
Vol 39 (1) ◽  
pp. 67-87 ◽  
Author(s):  
KJELLRUN T. ENGLUND

An established finding in research on infant-directed speech (IDS) is that vowels are hyperarticulated compared to adult-directed speech (ADS). Studies showing this investigate the point vowels only, leaving a rather weak foundation for concluding whether IDS vowels are hyperarticulated within a particular language. The aim of this study was to investigate a large sample of vowels in IDS and to elicit speech in a natural situation for mother and infant. Acoustic and statistical analyses of /æ:, æ, ø:, ɵ, o:, ɔ, y:, y, ʉ:, ʉ, e:, ɛ/ show a selective increase in formant frequencies for some vowel qualities. In addition, vowels had a higher fundamental frequency and were generally longer in IDS, but the difference between long and short vowels was comparable between IDS and ADS. Given the additionally fronted articulation and reduced lip protrusion in IDS compared to ADS, it is argued that IDS is hypoarticulated.
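A common acoustic summary in this hyper- versus hypoarticulation debate is the area of the F1–F2 vowel space. A sketch using the shoelace formula follows; the (F1, F2) means are invented for illustration, not taken from this study:

```python
import numpy as np

def polygon_area(points):
    """Shoelace formula for the area of a polygon with ordered vertices."""
    x, y = np.asarray(points, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Hypothetical (F1, F2) means in Hz for the point vowels /i a u/,
# ordered around the vowel triangle
ads = [(300, 2300), (750, 1300), (350, 800)]   # adult-directed speech
ids_ = [(320, 2200), (700, 1350), (380, 850)]  # infant-directed speech

print(polygon_area(ads), polygon_area(ids_))  # areas in Hz^2
```

A smaller vowel space area in IDS than in ADS, as in these made-up values, would point toward hypoarticulation; the study's broader argument is precisely that point-vowel triangles like this one can mislead, which is why it analyzed twelve vowel qualities.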


1993 ◽  
Vol 36 (4) ◽  
pp. 694-700 ◽  
Author(s):  
James Hillenbrand ◽  
Robert T. Gayvert

A quadratic discriminant classification technique was used to classify spectral measurements from vowels spoken by men, women, and children. The parameters used to train the discriminant classifier consisted of various combinations of fundamental frequency and the three lowest formant frequencies. Several nonlinear auditory transforms were evaluated. Unlike previous studies using a linear discriminant classifier, there was no advantage in category separability for any of the nonlinear auditory transforms over a linear frequency scale, and no advantage for spectral distances over absolute frequencies. However, it was found that parameter sets using nonlinear transforms and spectral differences reduced the differences between phonetically equivalent tokens produced by different groups of talkers.
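A quadratic discriminant classifier of the kind described, one full-covariance Gaussian per talker group over F0 and the lowest formant frequencies, can be sketched from scratch. The group means below are invented round numbers, not the study's measurements:

```python
import numpy as np

class QuadraticDiscriminant:
    """Gaussian quadratic discriminant: one full-covariance Gaussian per class,
    classification by maximum posterior log-likelihood."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.params = {}
        for c in self.classes:
            Xc = X[y == c]
            mu = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False)
            self.params[c] = (mu, np.linalg.inv(cov),
                              np.log(np.linalg.det(cov)),
                              np.log(len(Xc) / len(X)))
        return self

    def predict(self, X):
        scores = []
        for c in self.classes:
            mu, icov, logdet, logprior = self.params[c]
            d = X - mu
            # Gaussian log-likelihood up to an additive constant
            mahal = np.einsum("ij,jk,ik->i", d, icov, d)
            scores.append(-0.5 * (mahal + logdet) + logprior)
        return self.classes[np.argmax(scores, axis=0)]

rng = np.random.default_rng(1)
# Hypothetical (F0, F1, F2, F3) group means in Hz: men, women, children
means = [[120, 660, 1700, 2400], [210, 860, 2050, 2850], [260, 1030, 2320, 3200]]
X = np.vstack([rng.normal(m, [20, 60, 120, 150], size=(50, 4)) for m in means])
y = np.repeat([0, 1, 2], 50)

model = QuadraticDiscriminant().fit(X, y)
accuracy = (model.predict(X) == y).mean()
```

Because each class gets its own covariance matrix, the decision boundaries are quadratic rather than linear, which is the distinction the abstract draws against earlier linear discriminant studies. Auditory transforms (e.g. bark or log frequency) would simply be applied to the columns of `X` before fitting.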


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Santiago Barreda

Abstract Fast Track is a formant tracker implemented in Praat that attempts to automatically select the best analysis from a set of candidates. The best track is selected by modeling smooth formant contours across the entirety of the sound, providing the researcher with rich information about static and dynamic formant properties. Fast Track returns text files containing acoustic information (formant frequencies, formant bandwidths, fundamental frequency, etc.) sampled every 2 ms, generates images showing the winning analysis and comparing alternate analyses, and creates log files detailing analysis information for each file. Fast Track features a modular workflow that allows for analysis steps to be run (and re-run) independently as necessary, and is designed to allow for easy correction of tracking errors by allowing the user to override the automatic analysis, or manually edit tracks where necessary. In addition, Fast Track includes tools to aggregate data across tokens, and to easily create vowel plots of mean values or time-varying formant contours. The design and use of Fast Track are outlined using a re-analysis of the Hillenbrand et al. (1995) dataset, which suggests that Fast Track can be very accurate in cases where signal properties allow for reliable formant estimates.
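The candidate analyses that Fast Track compares are LPC-based. The core idea, solving for LPC coefficients and reading formant candidates off the angles of the complex poles, can be sketched in numpy (an illustration of the underlying technique, not Fast Track's Praat implementation; the test signal is a synthetic one-resonance process):

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients by the autocorrelation method (Levinson-Durbin)."""
    n = len(x)
    r = np.array([x[: n - k] @ x[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err
        prev = a.copy()
        for j in range(1, i + 1):
            a[j] = prev[j] + k * prev[i - j]
        err *= 1 - k * k
    return a

def formants_from_lpc(a, sr):
    """Formant candidates: angles of the complex LPC poles above the real axis."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0.01]
    return np.sort(np.angle(roots) * sr / (2 * np.pi))

# AR(2) test signal: white noise through a single 700 Hz, 80 Hz-bandwidth pole
sr = 8000
r_, th = np.exp(-np.pi * 80 / sr), 2 * np.pi * 700 / sr
a_true = [1.0, -2 * r_ * np.cos(th), r_ * r_]
rng = np.random.default_rng(0)
e = rng.normal(size=8000)
x = np.zeros(8000)
for i in range(8000):
    x[i] = e[i]
    if i >= 1:
        x[i] -= a_true[1] * x[i - 1]
    if i >= 2:
        x[i] -= a_true[2] * x[i - 2]

est = formants_from_lpc(lpc(x, 2), sr)
print(est)  # one candidate near 700 Hz
```

Fast Track's contribution sits on top of this kind of analysis: it runs many such analyses at different ceilings, then scores the smoothness of the resulting formant contours to pick a winner.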

