speaker variability

Author(s):  
Inga-Lena Johansson ◽  
Christina Samuelsson ◽  
Nicole Müller

Introduction: Assessment of intelligibility in dysarthria tends to rely on oral reading of sentences or words. However, self-generated utterances are closer to a client’s natural speech. This study investigated how transcription of utterances elicited by picture description can be used in the assessment of intelligibility in speakers with Parkinson’s disease. Methods: Speech samples from eleven speakers with Parkinson’s disease and six neurologically healthy persons were audio-recorded. Forty-two naive listeners completed transcriptions of self-generated sentences from a picture description task and orally read sentences from the Swedish Test of Intelligibility, as well as scaled ratings of narrative speech samples. Results: Intelligibility was higher in orally read than self-generated sentences and higher for content words than for the whole sentence in self-generated sentences for most of the speakers, although these within-group differences were not statistically significant at group level. Adding contextual leads for the listeners increased intelligibility in self-generated utterances significantly, but with individual variation. Although correlations between the intelligibility measures were at least moderate or strong, there was considerable inter- and intra-speaker variability in intelligibility scores between tasks for the speakers with Parkinson’s disease, indicating individual variation of factors that impact intelligibility. Intelligibility scores from neurologically healthy speakers were generally high across tasks with no significant differences between the conditions. Discussion/Conclusion: Within-speaker variability supports literature recommendations to use multiple methods and tasks when assessing intelligibility. Including transcription of self-generated utterances elicited by picture description in the intelligibility assessment has the potential to provide additional information to assessment methods based on oral reading of pre-scripted sentences, and to inform the planning of interventions.


2021 ◽  
pp. 002383092110460
Author(s):  
Martin Ho Kwan Ip ◽  
Anne Cutler

Many different prosodic cues can help listeners predict upcoming speech. However, no research to date has assessed listeners’ processing of preceding prosody from different speakers. The present experiments examine (1) whether individual speakers (of the same language variety) are likely to vary in their production of preceding prosody; (2) to the extent that there is talker variability, whether listeners are flexible enough to use any prosodic cues signaled by the individual speaker; and (3) whether types of prosodic cues (e.g., F0 versus duration) vary in informativeness. Using a phoneme-detection task, we examined whether listeners can entrain to different combinations of preceding prosodic cues to predict where focus will fall in an utterance. We used unsynthesized sentences recorded by four female native speakers of Australian English who happened to have used different preceding cues to produce sentences with prosodic focus: a combination of pre-focus overall duration cues, F0 and intensity (mean, maximum, range), and longer pre-target interval before the focused word onset (Speaker 1), only mean F0 cues, mean and maximum intensity, and longer pre-target interval (Speaker 2), only pre-target interval duration (Speaker 3), and only pre-focus overall duration and maximum intensity (Speaker 4). Listeners could entrain to almost every speaker’s cues (the exception being Speaker 4’s use of only pre-focus overall duration and maximum intensity), and could use whatever cues were available even when one of the cue sources was rendered uninformative. Our findings demonstrate both speaker variability and listener flexibility in the processing of prosodic focus.


2021 ◽  
Author(s):  
Alla Menshikova ◽  
Daniil Kocharov ◽  
Tatiana Kachkovskaia

2021 ◽  
Vol 6 ◽  
Author(s):  
Murray J. Munro

Hierarchies of difficulty in second-language (L2) phonology have long played a role in the postulation and evaluation of learning models. In L2 pronunciation teaching, hierarchies are assumed to be helpful in the development of instructional strategies based on anticipated areas of difficulty. This investigation addressed the practicality of defining a pedagogically useful hierarchy of difficulty for English tense and lax close vowels (/i ɪ u ʊ/) produced by Cantonese speakers. Unlike their English counterparts, Cantonese close tense-lax pairs are allophonic variants with [i u] occurring before alveolars and [ɪ ʊ] before velars. Each tense-lax pair represents a “phonemic split” in which members of a single L1 category are realized contrastively in L2. Despite evidence that English tense-lax distinctions are challenging for Cantonese speakers, no previous empirical work has closely considered the problem from the standpoint of vowel intelligibility across multiple phonetic contexts and in different words sharing the same rhyme. In a picture-based word-elicitation task, 18 Cantonese-speaking participants produced 31 high-frequency CV and CVC words. Vowels were evaluated for intelligibility by phonetically trained judges. A series of mixed-effects binary logistic models was fitted to the scores, with vowel quality, phonetic context (rhyme) and word as factors, and length of Canadian residence and daily use of English as covariates. As expected, the general hierarchy of difficulty for vowels that emerged (/i/ > /u/ > /ʊ/ > /ɪ/) was complicated by large differences across phonetic contexts. Results were not readily explicable in terms of transfer; moreover, different words with the same rhyme were not produced with equal intelligibility. The most serious modeling complication was the sizeable inter-speaker variability in difficulties, which could not be accounted for by model covariates. Although some difficulties were roughly systematic at the group level, it is argued that establishing a pedagogically useful hierarchy on such data would prove intractable. Rather, L2 learners might be better served by assessment and instructional targeting of their individual problem areas than by a focus on errors predicted from hierarchies of difficulty.


Author(s):  
Constantijn Kaland

This paper reports an automatic data-driven analysis for describing prototypical intonation patterns, particularly suitable for initial stages of prosodic research and language description. The approach has several advantages over traditional ways to investigate intonation, such as applicability to spontaneous speech, language and domain independence, and the potential to reveal meaningful functions of intonation. These features make the approach particularly useful for language documentation, where the description of prosody is often lacking. The core of the approach is a cluster analysis of time-series f0 measurements, implemented in two scripts (Praat and R, available from https://constantijnkaland.github.io/contourclustering/). Graphical user interfaces can be used to perform the analyses on collected data ranging from spontaneous to highly controlled speech. There is limited need for manual annotation prior to analysis, and speaker variability can be accounted for. After cluster analysis, Praat textgrids can be generated with the cluster number annotated for each individual contour. Although further confirmatory analysis is still required, the outcomes provide useful and unbiased directions for any investigation of prototypical f0 contours based on their acoustic form.
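The general idea of clustering f0 contours by shape can be illustrated with a minimal Python sketch. It is not the author's Praat and R implementation. It assumes contours already sampled at the same number of time points, uses per-contour z-scoring as a simple stand-in for handling speaker differences in pitch register, and uses average-linkage agglomerative clustering with Euclidean distance. The function names (`normalize`, `distance`, `cluster_contours`) and the f0 values are hypothetical.

```python
from statistics import mean, pstdev


def normalize(contour):
    """Z-score one f0 contour so that shape, not pitch register, is compared.

    Assumption: per-contour scaling stands in for speaker normalization.
    """
    m, sd = mean(contour), pstdev(contour)
    return [(v - m) / sd if sd else 0.0 for v in contour]


def distance(a, b):
    """Euclidean distance between two equal-length contours."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def cluster_contours(contours, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per contour."""
    points = [normalize(c) for c in contours]
    clusters = [[i] for i in range(len(points))]  # start with singletons
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest mean pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = mean(
                    distance(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Hypothetical f0 tracks in Hz: two rises and two falls from a lower and a
# higher voice. After normalization, the rises group together and so do the falls.
contours = [
    [100, 110, 120, 130, 140],
    [200, 220, 240, 260, 280],
    [140, 130, 120, 110, 100],
    [280, 260, 240, 220, 200],
]
labels = cluster_contours(contours, n_clusters=2)
```

The resulting labels play the role of the cluster numbers that the abstract describes being annotated for each contour; choosing the number of clusters and interpreting them would still require the confirmatory analysis the abstract mentions.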


2021 ◽  
Vol 11 (2) ◽  
pp. 177
Author(s):  
Mihye Choi ◽  
Mohinish Shukla

Speech is an acoustically variable signal, and one of the sources of this variation is the presence of multiple speakers. Empirical evidence has suggested that adult listeners possess remarkably sensitive (and systematic) abilities to process speech signals, despite speaker variability. This includes not only a sensitivity to speaker-specific variation, but also an ability to utilize speaker variation with other sources of information for further processing. Recently, many studies have also shown that young children seem to possess a similar capacity. This suggests continuity in the processing of speaker-dependent speech variability, and suggests that this ability could also be important for infants learning their native language. In the present paper, we review evidence for speaker variability and speech processing in adults, and speaker variability and speech processing in young children, with an emphasis on how they make use of speaker-specific information in word-learning situations. Finally, we will build on these findings to make a novel proposal for the use of speaker-specific information processing in phoneme learning in infancy.

