Sentence Recognition: Latest Research Papers

Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement

Trends in Hearing ◽

10.1177/23312165211068629 ◽

2022 ◽

Vol 26 ◽

pp. 233121652110686

Author(s):

Tim Green ◽

Gaston Hilkhuysen ◽

Mark Huckvale ◽

Stuart Rosen ◽

Mike Brookes ◽

...

Keyword(s):

Speech Enhancement ◽

Hearing Aids ◽

Spatial Information ◽

Signal To Noise Ratio ◽

Target Position ◽

Spatial Diversity ◽

Processing Scheme ◽

Time Frequency ◽

Sentence Recognition ◽

Enhancement Method

A signal processing approach combining beamforming with mask-informed speech enhancement was assessed by measuring sentence recognition in listeners with mild-to-moderate hearing impairment in adverse listening conditions that simulated the output of behind-the-ear hearing aids in a noisy classroom. Two types of beamforming were compared: binaural, with the two microphones of each aid treated as a single array, and bilateral, where independent left and right beamformers were derived. Binaural beamforming produces a narrower beam, maximising improvement in signal-to-noise ratio (SNR), but eliminates the spatial diversity that is preserved in bilateral beamforming. Each beamformer type was optimised for the true target position and implemented with and without additional speech enhancement in which spectral features extracted from the beamformer output were passed to a deep neural network trained to identify time-frequency regions dominated by target speech. Additional conditions comprising binaural beamforming combined with speech enhancement implemented using Wiener filtering or modulation-domain Kalman filtering were tested in normally-hearing (NH) listeners. Both beamformer types gave substantial improvements relative to no processing, with significantly greater benefit for binaural beamforming. Performance with additional mask-informed enhancement was poorer than with beamforming alone, for both beamformer types and both listener groups. In NH listeners the addition of mask-informed enhancement produced significantly poorer performance than both other forms of enhancement, neither of which differed from the beamformer alone. In summary, the additional improvement in SNR provided by binaural beamforming appeared to outweigh loss of spatial information, while speech understanding was not further improved by the mask-informed enhancement method implemented here.

Rate discrimination training may partially restore temporal processing abilities from age-related deficits

10.1101/2021.11.29.21266998 ◽

2021 ◽

Author(s):

Samira Anderson ◽

Lindsay DeVries ◽

Edward Wilson Smith ◽

Matthew J Goupell ◽

Sandra Gordon-Salant

Keyword(s):

Discrimination Training ◽

Active Control ◽

Temporal Processing ◽

Training Group ◽

Normal Hearing ◽

Auditory Temporal Processing ◽

Pulse Trains ◽

Sentence Recognition ◽

Age Related ◽

Rate Discrimination

The ability to understand speech in complex environments depends on the ability of the brain to preserve the precise timing characteristics of the speech signal. Age-related declines in temporal processing may contribute to the communication difficulties in challenging listening conditions experienced by older adults. The study purpose was to evaluate the effects of rate discrimination training on auditory temporal processing. A double-blind, randomized control design assigned 77 young normal-hearing, older normal-hearing, and older hearing-impaired listeners to one of two treatment groups: experimental (rate discrimination for 100-Hz and 300-Hz pulse trains) and active control (tone detection in noise). All listeners were evaluated during pre- and post-training sessions using perceptual rate discrimination of 100-, 200-, 300-, and 400-Hz band-limited pulse trains and auditory steady-state responses (ASSRs) to the same stimuli. Training generalization was evaluated using several temporal processing measures and sentence recognition tests that included time-compressed and reverberant speech stimuli. Results demonstrated a session x training group interaction for perceptual and ASSR testing to the trained frequencies (100 and 300 Hz), driven by greater improvements in the training group than in the active control group. Further, post-test rate discrimination of the older listeners reached levels that were equivalent to those of the younger listeners at pre-test. The training-specific gains generalized to untrained frequencies (200 and 400 Hz), but not to other temporal processing or sentence recognition measures. Further, non-auditory inhibition/attention performance predicted training-related improvement in rate discrimination. Overall, the results demonstrate the potential for auditory training to partially restore temporal processing in older listeners and highlight the role of cognitive function in these gains.

Reading aloud in clear speech reduces sentence recognition memory and recall for native and non-native talkers

The Journal of the Acoustical Society of America ◽

10.1121/10.0006732 ◽

2021 ◽

Vol 150 (5) ◽

pp. 3387-3398

Author(s):

Sandie Keerstock ◽

Rajka Smiljanic

Keyword(s):

Recognition Memory ◽

Reading Aloud ◽

Clear Speech ◽

Sentence Recognition ◽

Memory And Recall

Sentence recognition in individuals with history of multiple concussions while listening to accented speech in noise

The Journal of the Acoustical Society of America ◽

10.1121/10.0007635 ◽

2021 ◽

Vol 150 (4) ◽

pp. A64-A64

Author(s):

Madison Buntrock

Keyword(s):

Sentence Recognition ◽

Accented Speech ◽

Speech In Noise ◽

History Of

Amplitude modulation of background noise varies listeners’ spectral weights for sentence recognition

The Journal of the Acoustical Society of America ◽

10.1121/10.0008271 ◽

2021 ◽

Vol 150 (4) ◽

pp. A274-A274

Author(s):

Yi Shen ◽

Lauren Langley

Keyword(s):

Amplitude Modulation ◽

Background Noise ◽

Sentence Recognition

Semantic-Based Sentence Recognition in Images Using Bimodal Deep Learning

10.1109/icip42928.2021.9506688 ◽

2021 ◽

Author(s):

Yi Zheng ◽

Qitong Wang ◽

Margrit Betke

Keyword(s):

Deep Learning ◽

Sentence Recognition

Talker Adaptation and Lexical Difficulty Impact Word Recognition in Adults with Cochlear Implants

Audiology and Neurotology ◽

10.1159/000518643 ◽

2021 ◽

pp. 1-10

Author(s):

Terrin N. Tamati ◽

Aaron C. Moberly

Keyword(s):

Word Recognition ◽

Real World ◽

Temporal Processing ◽

Recognition Accuracy ◽

Recognition Task ◽

Individual Performance ◽

Auditory Sensitivity ◽

High Performing ◽

Sentence Recognition ◽

Talker Adaptation

Introduction: Talker-specific adaptation facilitates speech recognition in normal-hearing listeners. This study examined talker adaptation in adult cochlear implant (CI) users. Three hypotheses were tested: (1) high-performing adult CI users show improved word recognition following exposure to a talker (“talker adaptation”), particularly for lexically hard words, (2) individual performance is determined by auditory sensitivity and neurocognitive skills, and (3) individual performance relates to real-world functioning. Methods: Fifteen high-performing, post-lingually deaf adult CI users completed a word recognition task consisting of 6 single-talker blocks (3 female/3 male native English speakers); words were lexically “easy” and “hard.” Recognition accuracy was assessed “early” and “late” (first vs. last 10 trials); adaptation was assessed as the difference between late and early accuracy. Participants also completed measures of spectral-temporal processing and neurocognitive skills, as well as real-world measures of multiple-talker sentence recognition and quality of life (QoL). Results: CI users showed limited talker adaptation overall, but performance improved for lexically hard words. Stronger spectral-temporal processing and neurocognitive skills were weakly to moderately associated with more accurate word recognition and greater talker adaptation for hard words. Finally, word recognition accuracy for hard words was moderately related to multiple-talker sentence recognition and QoL. Conclusion: Findings demonstrate a limited talker adaptation benefit for recognition of hard words in adult CI users. Both auditory sensitivity and neurocognitive skills contribute to performance, suggesting additional benefit from adaptation for individuals with stronger skills. Finally, processing differences related to talker adaptation and lexical difficulty may be relevant to real-world functioning.

AI enabled sign language recognition and VR space bidirectional communication using triboelectric smart glove

Nature Communications ◽

10.1038/s41467-021-25637-w ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 2

Author(s):

Feng Wen ◽

Zixuan Zhang ◽

Tianyiyi He ◽

Chengkuo Lee

Keyword(s):

Deep Learning ◽

Sign Language ◽

Learning Model ◽

Virtual Space ◽

New Order ◽

Language Recognition ◽

Sign Language Recognition ◽

Sentence Recognition ◽

Bidirectional Communication ◽

Deep Learning Model

Sign language recognition, especially sentence recognition, is of great significance for lowering the communication barrier between the hearing/speech impaired and non-signers. General glove solutions, which are employed to detect the motions of our dexterous hands, only recognize discrete single gestures (i.e., numbers, letters, or words) instead of sentences, falling far short of meeting the needs of signers’ daily communication. Here, we propose an artificial intelligence enabled sign language recognition and communication system comprising sensing gloves, a deep learning block, and a virtual reality interface. The non-segmentation and segmentation assisted deep learning model achieves the recognition of 50 words and 20 sentences. Significantly, the segmentation approach splits entire sentence signals into word units. The deep learning model then recognizes all word elements and inversely reconstructs and recognizes the sentences. Furthermore, new/never-seen sentences created by recombining word elements in a new order can be recognized with an average correct rate of 86.67%. Finally, the sign language recognition results are projected into virtual space and translated into text and audio, allowing remote and bidirectional communication between signers and non-signers.

The Impact of Neurocognitive Skills on Recognition of Spectrally Degraded Sentences

Journal of the American Academy of Audiology ◽

10.1055/s-0041-1732438 ◽

2021 ◽

Vol 32 (08) ◽

pp. 528-536

Author(s):

Jessica H. Lewis ◽

Irina Castellanos ◽

Aaron C. Moberly

Keyword(s):

Speech Recognition ◽

Young Adult ◽

Temporal Processing ◽

Recognition Performance ◽

Channel Noise ◽

Sentence Recognition ◽

Inhibition Concentration ◽

Spectral Degradation ◽

The Impact ◽

Clinical Populations

Background: Recent models theorize that neurocognitive resources are deployed differently during speech recognition depending on task demands, such as the severity of degradation of the signal or modality (auditory vs. audiovisual [AV]). This concept is particularly relevant to the adult cochlear implant (CI) population, considering the large amount of variability among CI users in their spectro-temporal processing abilities. However, disentangling the effects of individual differences in spectro-temporal processing and neurocognitive skills on speech recognition in clinical populations of adult CI users is challenging. Thus, this study investigated the relationship between neurocognitive functions and recognition of spectrally degraded speech in a group of young adult normal-hearing (NH) listeners. Purpose: The aim of this study was to manipulate the degree of spectral degradation and modality of speech presented to young adult NH listeners to determine whether deployment of neurocognitive skills would be affected. Research Design: Correlational study design. Study Sample: Twenty-one NH college students. Data Collection and Analysis: Participants listened to sentences in three spectral-degradation conditions: no degradation (clear sentences); moderate degradation (8-channel noise-vocoded); and high degradation (4-channel noise-vocoded). Thirty sentences were presented in an auditory-only (A-only) modality and an AV fashion. Visual assessments from the National Institutes of Health Toolbox Cognitive Battery were completed to evaluate working memory, inhibition-concentration, cognitive flexibility, and processing speed. Analyses of variance compared speech recognition performance across spectral degradation conditions and modalities. Bivariate correlation analyses were performed between speech recognition performance and the neurocognitive skills in the various test conditions. Results: Main effects on sentence recognition were found for degree of degradation (p < 0.001) and modality (p < 0.001). Inhibition-concentration skills moderately correlated (r = 0.45, p = 0.02) with recognition scores for sentences that were moderately degraded in the A-only condition. No correlations were found between neurocognitive scores and AV speech recognition scores. Conclusions: Inhibition-concentration skills are deployed differentially during sentence recognition, depending on the level of signal degradation. Additional studies will be required to study these relations in actual clinical populations such as adult CI users.

Effects of Adaptive Non-linear Frequency Compression in Hearing Aids on Mandarin Speech and Sound-Quality Perception

Frontiers in Neuroscience ◽

10.3389/fnins.2021.722970 ◽

2021 ◽

Vol 15 ◽

Author(s):

Shuang Qi ◽

Xueqing Chen ◽

Jing Yang ◽

Xianhui Wang ◽

Xin Tian ◽

...

Keyword(s):

Hearing Loss ◽

Sensorineural Hearing Loss ◽

Hearing Aids ◽

Sound Quality ◽

Compression Algorithm ◽

Frequency Compression ◽

Quality Perception ◽

Linear Frequency ◽

Sentence Recognition ◽

Non Linear

Objective: This study aimed to examine the effects of an adaptive non-linear frequency compression algorithm implemented in hearing aids (i.e., SoundRecover2, or SR2) at different parameter settings, and of auditory acclimatization, on speech and sound-quality perception in native Mandarin-speaking adult listeners with sensorineural hearing loss. Design: Data consisted of participants’ unaided and aided hearing thresholds, Mandarin consonant and vowel recognition in quiet, and sentence recognition in noise, as well as sound-quality ratings through five sessions in a 12-week period with three SR2 settings (i.e., SR2 off, SR2 default, and SR2 strong). Study Sample: Twenty-nine native Mandarin-speaking adults aged 37–76 years with symmetric sloping moderate-to-profound sensorineural hearing loss were recruited. They were all fitted bilaterally with Phonak Naida V90-SP BTE hearing aids with hard ear-molds. Results: The participants demonstrated a significant improvement of aided hearing in detecting high frequency sounds at 8 kHz. For consonant recognition and overall sound-quality rating, the participants performed significantly better with the SR2 default setting than the other two settings. No significant differences were found in vowel and sentence recognition among the three SR2 settings. Test session was a significant factor that contributed to the participants’ performance in all speech and sound-quality perception tests. Specifically, the participants benefited from a longer duration of hearing aid use. Conclusion: Findings from this study suggested possible perceptual benefit from the adaptive non-linear frequency compression algorithm for native Mandarin-speaking adults with moderate-to-profound hearing loss. A period of acclimatization should be allowed to obtain better performance with novel hearing-aid technologies.
