spectral subtraction
Recently Published Documents


TOTAL DOCUMENTS

479
(FIVE YEARS 59)

H-INDEX

28
(FIVE YEARS 2)

Author(s):  
Nesrine Abajaddi ◽  
Badia Mounir ◽  
Laila Elmaazouzi ◽  
Ilham Mounir ◽  
Abdelmajid Farchi

2021 ◽  
Author(s):  
Kun Liao

Due to the shortcomings of acoustic feature parameters in speech signals and the limitations of existing acoustic features in characterizing the integrity of speech information, this paper proposes a method for speech recognition combining cochlear features and random forest. Environmental noise can pose a threat to the stable operation of current speech recognition systems. It is therefore essential to develop robust systems that are able to identify speech under low signal-to-noise ratios. In this paper, we propose a method of speech recognition combining spectral subtraction with auditory and energy feature extraction. The method first extracts novel auditory features based on cochlear filter cepstral coefficients (CFCC) and instantaneous frequency (IF), i.e., CFCCIF. Spectral subtraction is then introduced into the front end of feature extraction, and the extracted feature is called the enhanced auditory feature (EAF). An energy feature, the Teager energy operator (TEO), is also extracted; the combination of the two is known as a fusion feature. Linear discriminant analysis (LDA) is then applied to feature selection and optimization of the fusion feature. Finally, random forest (RF) is used as the classifier in a speaker-independent, isolated-word, small-vocabulary speech recognition system. On the Korean isolated-word database, the proposed features (i.e., EAF), after fusion with Teager energy features, show strong robustness in noisy conditions. Our experiments show that the optimized feature displays a high recognition rate and excellent anti-noise performance in a speech recognition task.
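
The abstract above names the Teager energy operator (TEO) as its energy feature but does not say how it is computed. A minimal sketch of the standard discrete-time TEO follows, as an illustration only; the function name `teager_energy` and the test signal are assumptions, not taken from the paper.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Only interior samples have both neighbours, so the output is two
    samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Hypothetical input: a unit-amplitude cosine at 0.3 rad/sample.
omega = 0.3
signal = [math.cos(omega * n) for n in range(50)]
teo = teager_energy(signal)
```

For a pure sinusoid of amplitude A and digital frequency Ω, the operator returns the constant A² sin²(Ω), so it tracks amplitude and frequency together. This is why it is often used as an energy feature.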


2021 ◽  
Vol 15 (4) ◽  
pp. 8480-8489
Author(s):  
Che Ku Eddy Nizwan Che Ku Husin ◽  
Mohd Fairusham Ghazali ◽  
Ahmad Razlan Yusoff

In modal analysis, measurements of the input force and the vibration response are crucial for accurately determining the transfer function of a structure. However, under operating conditions, the force induced by operating machinery cannot be measured because of sensor placement constraints. In this case, the ambient response induced by the operating force should be suppressed to minimize the error in the Frequency Response Function (FRF) calculation. This paper presents the use of a modified spectral subtraction filter for ambient suppression. Introducing an effective ambient magnitude into the gain function calculation increases the efficiency of the spectral subtraction filter. This parameter is calculated from the phase information of the reconstructed artificial ambient response. Measurements using experimental modal analysis (EMA) were carried out on a motor-driven structure to verify the proposed technique. Two sets of data, under shutdown and running conditions, were recorded to observe the effect of the ambient operating force. Under the operating condition, the measured FRF shows non-identical features at the operating frequencies compared with the baseline data. Applying the filtering process effectively suppresses the ambient features contained in the transfer function. The output of the filtering algorithm could provide an alternative way to perform the EMA procedure under running conditions.


Sensors ◽  
2021 ◽  
Vol 21 (21) ◽  
pp. 7025
Author(s):  
Jenifa Gnanamanickam ◽  
Yuvaraj Natarajan ◽  
Sri Preethaa K. R.

In recent years, speech recognition technology has become a more common notion. Speech quality and intelligibility are critical for the convenience and accuracy of information transmission in speech recognition. The speech processing systems used to communicate or store speech are usually designed for an environment without any background noise. However, in a real-world setting, interference in the form of background noise and channel noise drastically reduces the performance of speech recognition systems, resulting in imprecise information transfer and exhausting the listener. When communication systems’ input or output signals are affected by noise, speech enhancement techniques try to improve their performance. To ensure the correctness of the text produced from speech, it is necessary to reduce the external noise in the speech audio. Reducing this noise is difficult because the speech may consist of single, continuous, or spontaneous words. In automatic speech recognition, various typical speech enhancement algorithms are available and have gained considerable attention. However, these enhancement algorithms work well only on simple and continuous audio signals. Thus, in this study, a hybridized speech recognition algorithm to enhance speech recognition accuracy is proposed. Non-linear spectral subtraction, a well-known speech enhancement algorithm, is optimized with the Hidden Markov Model and tested with 6660 medical speech transcription audio files and 1440 Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) audio files. The performance of the proposed model is compared with those of various typical speech enhancement algorithms, such as the iterative signal enhancement algorithm, subspace-based speech enhancement, and non-linear spectral subtraction. The proposed cascaded hybrid algorithm was found to achieve a minimum word error rate of 9.5% and 7.6% for medical speech and RAVDESS speech, respectively.
The cascading of the speech enhancement and speech-to-text conversion architectures results in higher accuracy for enhanced speech recognition. The evaluation results support incorporating the proposed method into real-time automatic speech recognition for medical applications, where the complexity of the terms involved is high.
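
The abstract above reports its results as word error rate (WER). For readers unfamiliar with the metric, here is a minimal sketch of the usual definition: substitutions, deletions, and insertions counted by a word-level edit distance, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not data from the study.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# Hypothetical transcript pair: one substitution and one deletion in 5 words.
wer = word_error_rate("the patient has a fever", "the patient had fever")
```

Because insertions are counted, WER can exceed 100% when the hypothesis is much longer than the reference.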


2021 ◽  
pp. 2250008
Author(s):  
N. Radha ◽  
R. B. Jananie ◽  
A. Anto Silviya

Speech processing is an important application area of digital signal processing that helps examine and analyze speech signals. Within it, speech enhancement is essential because it improves signal quality, which helps resolve communication challenges. Different speech enhancement algorithms are used in research, but limited processing capabilities, maximum microphone distance, and voice-first I/O interfaces add computational complexity. In this paper, speech enhancement is done in two stages. In the first stage, the spectral subtraction method is applied to the LJ Speech dataset: the noise spectrum is estimated during pauses and subtracted from the noisy speech signal to obtain the clean speech signal. However, the spectral subtraction method still introduces artificial noise and narrow-band noise in the spectrum. Hence, artificial bandwidth expansion with a deep shallow convolution neural network (ABE-DSCNN) is implemented as the second stage. The developed system is then compared with conventional enhancement approaches such as deep learning network (DNN), neural beamforming (NB), and generative adversarial network (GAN). The experimental results show that ABE-DSCNN provides a 4% increase in PESQ and improves the error rate by 40% to 56% relative to the other existing algorithms for 1000 speech samples. Hence, the paper concludes that the ABE-DSCNN approach effectively improves speech quality.
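
The first stage described above (estimate the noise spectrum during pauses, then subtract it) can be sketched in the magnitude domain as follows. This is a minimal illustration of basic spectral subtraction, not the paper's implementation. It assumes the short-time magnitude spectra have already been computed and that pause frames have already been identified; the function names and numbers are hypothetical.

```python
def estimate_noise(pause_frames):
    """Average the magnitude spectrum over frames flagged as speech pauses."""
    n_bins = len(pause_frames[0])
    n_frames = len(pause_frames)
    return [sum(frame[k] for frame in pause_frames) / n_frames
            for k in range(n_bins)]

def subtract_noise(noisy_frame, noise_mag):
    """Subtract the noise estimate per frequency bin.

    Negative results are clamped to zero (half-wave rectification),
    since a magnitude cannot be negative.
    """
    return [max(y - d, 0.0) for y, d in zip(noisy_frame, noise_mag)]

# Hypothetical 3-bin magnitude spectra.
pause_frames = [[0.2, 0.1, 0.3], [0.4, 0.1, 0.1]]
noise_mag = estimate_noise(pause_frames)            # about [0.3, 0.1, 0.2]
noisy_frame = [1.0, 0.05, 0.7]
enhanced = subtract_noise(noisy_frame, noise_mag)   # about [0.7, 0.0, 0.5]
```

In a full system the enhanced magnitude is usually recombined with the phase of the noisy signal before resynthesis. The clamping step tends to leave isolated spectral peaks, which is one source of the residual artificial noise that the abstract mentions.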


Animals ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 2238
Author(s):  
Zhigang Sun ◽  
Mengmeng Gao ◽  
Guotao Wang ◽  
Bingze Lv ◽  
Cailing He ◽  
...  

Broiler sounds can provide feedback on the birds' body condition, to a certain extent. To address the noise in sound signals collected in broiler farms, this study evaluates filtering methods for broiler sound signals from multiple perspectives in order to identify the best performer. The perspectives are the signal angle and the recognition angle, which are embodied in three indicators: signal-to-noise ratio (SNR), root mean square error (RMSE), and prediction accuracy. The signal filtering methods used in this study include Basic Spectral Subtraction, Improved Spectral Subtraction based on multi-taper spectrum estimation, Wiener filtering, and Sparse Decomposition using both thirty atoms and fifty atoms. In the analysis of the signal angle, Improved Spectral Subtraction based on multi-taper spectrum estimation achieved the highest average SNR of 5.5145 and the smallest average RMSE of 0.0508. In the analysis of the recognition angle, the kNN classifier and Random Forest classifier achieved the highest average prediction accuracy on the data set established from the sound signals filtered by Wiener filtering, at 88.83% and 88.69%, respectively. These are significantly higher than those obtained by classifiers on data sets established from sound signals filtered by other methods. Further research shows that after removing the starting noise in the sound signal, Wiener filtering achieved the highest average SNR of 5.6108 and a new RMSE of 0.0551. Finally, in a comprehensive analysis of both the signal angle and the recognition angle, this research determined that Wiener filtering is the best broiler sound signal filtering method. This research lays the foundation for follow-up research on extracting classification features from high-quality broiler sound signals to realize broiler health monitoring.
At the same time, the research results can be popularized and applied to studies on the detection and processing of livestock and poultry sound signals, which has extremely important reference and practical value.
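
The two signal-angle indicators used above, SNR and RMSE, can be sketched as follows. The abstract does not state how they were computed, so this assumes the common definitions against a known reference signal, with SNR expressed in decibels. The function names and sample values are illustrative only.

```python
import math

def rmse(reference, estimate):
    """Root mean square error between a reference and a filtered signal."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))

def snr_db(reference, estimate):
    """SNR in dB: reference power over the power of the residual error.

    Assumes the two signals differ; identical signals give zero error power.
    """
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_power / error_power)

# Hypothetical samples: the filtered signal is the reference scaled by 0.9.
reference = [1.0, -1.0, 1.0, -1.0]
filtered = [0.9, -0.9, 0.9, -0.9]
error = rmse(reference, filtered)     # about 0.1
ratio = snr_db(reference, filtered)   # about 20 dB
```

A higher SNR and a lower RMSE both indicate that the filtered signal is closer to the reference, which matches how the abstract ranks the filtering methods.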


2021 ◽  
Vol 11 (15) ◽  
pp. 6858
Author(s):  
Min Chen ◽  
Chang-Myung Lee

The generalized spectral subtraction algorithm (GBSS), which has extraordinary ability in background noise reduction, is historically one of the first approaches used for speech enhancement and dereverberation. However, the algorithm has not been applied to de-noise the room impulse response (RIR) to extend the reverberation decay range. The application of the GBSS algorithm in this study is stated as an optimization problem, that is, subtracting the noise level from the RIR while maintaining the signal quality. The optimization process, conducted on measurements of RIRs with artificial noise and natural ambient noise, aims to determine the optimal sets of factors that achieve the best noise reduction results in terms of the largest dynamic range improvement. The optimal factors are set variables determined by the estimated SNRs of the RIRs filtered in the octave band. The acoustic parameters, namely the reverberation time (RT) and early decay time (EDT), and the dynamic range improvement of the energy decay curve were used as control measures and evaluation criteria to ensure the reliability of the algorithm. The de-noising results were compared with noise compensation methods. With the optimal factors, the GBSS significantly improves the dynamic range and decreases the estimation errors in the RTs caused by noise levels.
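
The abstract above refers to a set of tunable factors in the generalized algorithm without listing them. As a rough illustration, one widely used parametric form of generalized spectral subtraction has an over-subtraction factor, a spectral floor, and a spectral exponent. The sketch below assumes that form; the parameter names `alpha`, `beta`, and `gamma` and their values are assumptions, not the factors optimized in the paper.

```python
def generalized_subtract(noisy_mag, noise_mag, alpha=2.0, beta=0.01, gamma=2.0):
    """Generalized spectral subtraction on magnitude spectra.

    For each bin: |S|^gamma = max(|Y|^gamma - alpha * |D|^gamma,
                                  beta * |D|^gamma)
    alpha: over-subtraction factor, beta: spectral floor,
    gamma: exponent (1 = magnitude domain, 2 = power domain).
    Inputs are assumed non-negative and beta is assumed >= 0.
    """
    enhanced = []
    for y, d in zip(noisy_mag, noise_mag):
        subtracted = y ** gamma - alpha * d ** gamma
        floor = beta * d ** gamma
        enhanced.append(max(subtracted, floor) ** (1.0 / gamma))
    return enhanced

# Hypothetical 2-bin example: the second bin falls below the floor.
result = generalized_subtract([1.0, 0.5], [0.5, 0.5])   # about [0.7071, 0.05]
```

Raising `alpha` removes more noise at the cost of more signal distortion, while `beta` keeps a small noise floor instead of hard zeros. Choosing these values per band is the kind of trade-off the abstract frames as an optimization problem.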

