New Method Boosts Speech Intelligibility in Noisy Environments

2017 ◽  
Vol 70 (12) ◽  
pp. 38 ◽  
Author(s):  
Peyman Goli ◽  
Mohammad Raofy

2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nikolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than the errors obtained using the SII and STI (2.0 and 2.1 dB, respectively).
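The abstract does not include an implementation, but the core idea, predicting the SRT as the signal-to-noise ratio at which the speech-to-noise binaural loudness ratio reaches a criterion, can be sketched as follows. This is a minimal Python illustration only: the binaural_loudness function is a hypothetical stand-in for a full ISO 532-2 (2017) loudness model, and the criterion value is an assumed free parameter that would be fitted to measured data.

```python
import numpy as np

def binaural_loudness(left, right):
    # Hypothetical stand-in for an ISO 532-2 (2017) binaural loudness model.
    # A real implementation computes specific loudness per critical band for
    # each ear and combines the two ears; a crude RMS-based proxy keeps the
    # sketch runnable.
    return np.sqrt(np.mean(left**2) + np.mean(right**2))

def predicted_srt(speech_lr, noise_lr, criterion=1.0):
    # Scan candidate SNRs and return the first at which the speech-to-noise
    # binaural loudness ratio reaches the (assumed) criterion value.
    s_l, s_r = speech_lr
    n_l, n_r = noise_lr
    noise_loud = binaural_loudness(n_l, n_r)
    for snr_db in np.arange(-30.0, 10.0, 0.5):
        gain = 10.0 ** (snr_db / 20.0)  # speech level varied, noise fixed
        ratio = binaural_loudness(gain * s_l, gain * s_r) / noise_loud
        if ratio >= criterion:
            return snr_db
    return None
```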


2018 ◽  
Vol 37 (2) ◽  
pp. 159 ◽  
Author(s):  
Fatemeh Vakhshiteh ◽  
Farshad Almasganj ◽  
Ahmad Nickabadi

Lip-reading is the visual interpretation of a speaker's lip movements during speech. Experiments over many years have shown that speech intelligibility increases when visual facial information is available, and the effect becomes more pronounced in noisy environments. Automating this process raises several challenges, such as the coarticulation phenomenon, the choice of visual units, feature diversity, and inter-speaker dependency. While efforts have been made to overcome these challenges, a flawless lip-reading system remains an open problem. This paper develops a lip-reading model whose processing blocks are organized and combined to extract highly discriminative visual features, highlighting the application of a properly structured Deep Belief Network (DBN)-based recognizer. Multi-speaker (MS) and speaker-independent (SI) tasks are performed on the CUAVE database, and phone recognition rates (PRRs) of 77.65% and 73.40% are achieved, respectively. The best word recognition rates (WRRs) achieved in the MS and SI tasks are 80.25% and 76.91%, respectively. The resulting accuracies demonstrate that the proposed method outperforms the conventional Hidden Markov Model (HMM) and competes well with state-of-the-art visual speech recognition systems.
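As a rough illustration of the DBN-based recognition idea (not the paper's exact architecture), the sketch below stacks two restricted Boltzmann machines as unsupervised feature extractors with a softmax classifier on top, using scikit-learn. The input features and labels are random stand-ins for mouth-region descriptors and phone classes, and all layer sizes and learning rates are assumptions.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((500, 64))          # stand-in visual feature vectors, scaled to [0, 1]
y = rng.integers(0, 10, size=500)  # stand-in phone-class labels

# Stacked RBMs learn a feature hierarchy; logistic regression maps the
# top-level features to phone classes.
dbn = Pipeline([
    ("rbm1", BernoulliRBM(n_components=128, learning_rate=0.05, n_iter=20, random_state=0)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)),
    ("clf", LogisticRegression(max_iter=1000)),
])
dbn.fit(X, y)
print("train accuracy:", dbn.score(X, y))
```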


2015 ◽  
Vol 5 (2) ◽  
Author(s):  
Norbert Dillier ◽  
Wai Kong Lai

The Nucleus® 5 System Sound Processor (CP810, Cochlear™, Macquarie University, NSW, Australia) contains two omnidirectional microphones. They can be configured as a fixed directional microphone combination (called Zoom) or as an adaptive beamformer (called Beam), which adjusts the directivity continuously to maximally reduce the interfering noise. Initial evaluation studies had compared the performance and usability of the new CP810 processor with the Freedom™ Sound Processor (Cochlear™) for speech in quiet and in noise for a subset of the processing options. This study compares the two processing options recommended for noisy environments, Zoom and Beam, under various sound-field conditions using a standardized speech-in-noise matrix test (Oldenburg sentence test). Nine German-speaking subjects who had previously used the Freedom speech processor and were subsequently upgraded to the CP810 device participated in this series of additional evaluation tests. The speech reception threshold (SRT, for 50% speech intelligibility in noise) was determined using sentences presented via a loudspeaker at 65 dB SPL in front of the listener, with noise presented either via the same loudspeaker (S0N0) or at 90 degrees on the side of the sound processor (S0NCI+) or of the opposite, unaided ear (S0NCI-). The fourth noise condition consisted of three uncorrelated noise sources placed at 90, 180, and 270 degrees. The noise level was adjusted through an adaptive procedure to yield the signal-to-noise ratio at which 50% of the words in the sentences were correctly understood. In spatially separated speech and noise conditions, both Zoom and Beam improved the SRT significantly. For single noise sources, either ipsilateral or contralateral to the cochlear implant sound processor, Beam yielded average SRT improvements of 12.9 and 7.9 dB, respectively. The average SRT of –8 dB for Beam in the diffuse noise condition (uncorrelated noise from both sides and the back) is remarkable and comparable to the performance of normal-hearing listeners in the same test environment. The static directivity option (Zoom) in the diffuse noise condition still provides a significant benefit of 5.9 dB over the standard omnidirectional microphone setting. These results indicate that CI recipients may significantly improve their speech recognition in noisy environments using these directional microphone-processing options.
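The principle behind an adaptive two-microphone beamformer of this kind can be illustrated with a minimal first-order differential sketch (a generic Elko-style construction, not Cochlear's proprietary Beam algorithm). Forward- and backward-facing cardioids are formed by delay-and-subtract, and the backward weight beta is adapted to steer a null toward the dominant rear or side noise. The one-sample inter-microphone delay assumes a spacing of roughly 2.1 cm at a 16 kHz sampling rate, both hypothetical values.

```python
import numpy as np

def adaptive_differential_beamformer(front, back, mu=0.05, eps=1e-8):
    # Cardioid pair via one-sample delay-and-subtract (d/c ~= 1 sample here).
    c_f = front[1:] - back[:-1]   # forward cardioid: front minus delayed back
    c_b = back[1:] - front[:-1]   # backward cardioid: back minus delayed front
    y = np.zeros_like(c_f)
    beta = 0.0
    for n in range(len(c_f)):
        y[n] = c_f[n] - beta * c_b[n]
        # NLMS update of beta to minimize output power; clipping keeps the
        # null in the rear half-plane so the frontal target is preserved.
        beta += mu * y[n] * c_b[n] / (c_b[n] ** 2 + eps)
        beta = float(np.clip(beta, 0.0, 1.0))
    return y
```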


2021 ◽  
pp. 1-12
Author(s):  
Jie Wang ◽  
Linhuang Yan ◽  
Qiaohe Yang ◽  
Minmin Yuan

In this paper, a single-channel speech enhancement algorithm is proposed that applies guided spectrogram filtering based on the masking properties of the human auditory system, treating the speech spectrogram as an image. Guided filtering is capable of sharpening details and estimating unwanted textures or background noise in the noisy speech spectrogram. If the noisy spectrogram is considered a degraded image, the spectrogram of the clean speech signal can be estimated using guided filtering after subtracting noise components. Combined with the masking properties of the human auditory system, the proposed algorithm adaptively adjusts and reduces the residual noise of the enhanced speech spectrogram according to the corresponding masking threshold. Because the filtering output is a local linear transform of the guidance spectrogram, the sliding local mask window can be implemented efficiently via a box filter with O(N) computational complexity. Experimental results show that the proposed algorithm effectively suppresses noise in different noisy environments and thus greatly improves speech quality and speech intelligibility.
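A minimal self-guided filter over a magnitude spectrogram, following He et al.'s closed-form guided filter, illustrates the mechanism described above; the paper's masking-threshold adaptation is omitted, and the radius and eps values are assumptions. The box means are computed with a uniform filter, which is what gives the O(N)-per-pixel cost noted in the abstract.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter_self(spec, radius=2, eps=1e-2):
    # Self-guided filter: guide image and input image are both the noisy
    # magnitude spectrogram. Output is a local linear transform of the guide.
    size = 2 * radius + 1
    mean = uniform_filter(spec, size)
    var = uniform_filter(spec * spec, size) - mean * mean
    a = var / (var + eps)   # ~1 on strong speech structure (edges preserved)
    b = (1.0 - a) * mean    # ~local mean in flat, noise-like regions
    return uniform_filter(a, size) * spec + uniform_filter(b, size)
```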


2021 ◽  
Vol 42 (03) ◽  
pp. 260-281
Author(s):  
Asger Heidemann Andersen ◽  
Sébastien Santurette ◽  
Michael Syskind Pedersen ◽  
Emina Alickovic ◽  
Lorenz Fiedler ◽  
...  

Hearing aids continue to acquire increasingly sophisticated sound-processing features beyond basic amplification. On the one hand, these have the potential to add user benefit and allow for personalization. On the other hand, if such features are to deliver their potential benefit, clinicians must be acquainted with both the underlying technologies and the specific fitting handles made available by the individual hearing aid manufacturers. Ensuring benefit from hearing aids in typical daily listening environments requires that the hearing aids handle sounds that interfere with communication, generically referred to as "noise." With this aim, considerable efforts from both academia and industry have led to increasingly advanced algorithms that handle noise, typically using the principles of directional processing and postfiltering. This article provides an overview of the techniques used for noise reduction in modern hearing aids. First, classical techniques are covered as they are used in modern hearing aids. The discussion then shifts to how deep learning, a subfield of artificial intelligence, provides a radically different way of solving the noise problem. Finally, the results of several experiments are used to showcase the benefits of recent algorithmic advances in terms of signal-to-noise ratio, speech intelligibility, selective attention, and listening effort.
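As an illustration of the postfiltering principle mentioned above, the sketch below computes a per-bin Wiener gain from noisy-signal and noise power spectra. It is a generic textbook postfilter, not any manufacturer's implementation, and it assumes the noise spectrum has already been estimated (e.g., during speech pauses).

```python
import numpy as np

def wiener_postfilter(noisy_psd, noise_psd, gain_floor=0.1):
    # Maximum-likelihood a priori SNR estimate per frequency bin.
    snr_prio = np.maximum(noisy_psd / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)
    gain = snr_prio / (snr_prio + 1.0)   # Wiener gain: attenuate low-SNR bins
    return np.maximum(gain, gain_floor)  # gain floor limits musical noise
```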


1999 ◽  
Vol 38 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Jan Wouters ◽  
Luc Litière ◽  
Astrid van Wieringen
