scholarly journals Wiener Gain and Deep Neural Networks: A Well-Balanced Pair For Speech Enhancement

Author(s):  
Dayana Ribas ◽  
Antonio Miguel ◽  
Alfonso Ortega ◽  
Eduardo Lleida

Abstract This paper proposes a Deep Neural Network (DNN)-based Wiener gain estimator for speech enhancement. The proposal is in the framework of the classical spectral-domain speech enhancement algorithms. In this case, we used the Optimal Modified Log-Spectral Amplitude (OMLSA), but consider that this proposal could fit many alternative speech estimation algorithms. We determined the best usage of the DNN approach at learning a robust instance of the Wiener gain estimator according to the characteristics of the SNR estimation and the gain function. To design a DNN architecture adjusted for the speech enhancement task, we study various configuration issues frequently used in DNN-based solutions, including speech representations, residual connections, and causal vs. non-causal designs. Thus, we provide conclusions for the use of DNN architectures with the enhancement purpose. Experiments show that the proposal provides results on the state-of-the-art. But beyond the objective quality measures, there are examples of noisy vs. enhanced speech available for listening to demonstrate in practice the skills of the method in real audio.

Author(s):  
Dong-Dong Chen ◽  
Wei Wang ◽  
Wei Gao ◽  
Zhi-Hua Zhou

Deep neural networks have witnessed great successes in various real applications, but it requires a large number of labeled data for training. In this paper, we propose tri-net, a deep neural network which is able to use massive unlabeled data to help learning with limited labeled data. We consider model initialization, diversity augmentation and pseudo-label editing simultaneously. In our work, we utilize output smearing to initialize modules, use fine-tuning on labeled data to augment diversity and eliminate unstable pseudo-labels to alleviate the influence of suspicious pseudo-labeled data. Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. In particular, it achieves 8.30% error rate on CIFAR-10 by using only 4000 labeled examples.


2019 ◽  
Vol 8 (2S11) ◽  
pp. 1058-1062

This paper presents a method for speech enhancement to predict speech quality in presence of highly non-stationary scenarios using basic wiener filtering in frequency domain with an adaptive gain function under eight different noises at three different ranges of input SNR. Its performance is evaluated in terms of objective quality measures like LPC based spectral distortion measures are Cepstrum Distance, Itakura Saito and Log Likelihood Ratio. This method was tested using Noizeous database, its performance measures were compared against spectral subtractive type algorithms and it shows its improvements in terms of objective quality measures.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Florian Stelzer ◽  
André Röhm ◽  
Raul Vicente ◽  
Ingo Fischer ◽  
Serhiy Yanchuk

AbstractDeep neural networks are among the most widely applied machine learning tools showing outstanding performance in a broad range of tasks. We present a method for folding a deep neural network of arbitrary size into a single neuron with multiple time-delayed feedback loops. This single-neuron deep neural network comprises only a single nonlinearity and appropriately adjusted modulations of the feedback signals. The network states emerge in time as a temporal unfolding of the neuron’s dynamics. By adjusting the feedback-modulation within the loops, we adapt the network’s connection weights. These connection weights are determined via a back-propagation algorithm, where both the delay-induced and local network connections must be taken into account. Our approach can fully represent standard Deep Neural Networks (DNN), encompasses sparse DNNs, and extends the DNN concept toward dynamical systems implementations. The new method, which we call Folded-in-time DNN (Fit-DNN), exhibits promising performance in a set of benchmark tasks.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Mohammed Aliy Mohammed ◽  
Fetulhak Abdurahman ◽  
Yodit Abebe Ayalew

Abstract Background Automating cytology-based cervical cancer screening could alleviate the shortage of skilled pathologists in developing countries. Up until now, computer vision experts have attempted numerous semi and fully automated approaches to address the need. Yet, these days, leveraging the astonishing accuracy and reproducibility of deep neural networks has become common among computer vision experts. In this regard, the purpose of this study is to classify single-cell Pap smear (cytology) images using pre-trained deep convolutional neural network (DCNN) image classifiers. We have fine-tuned the top ten pre-trained DCNN image classifiers and evaluated them using five class single-cell Pap smear images from SIPaKMeD dataset. The pre-trained DCNN image classifiers were selected from Keras Applications based on their top 1% accuracy. Results Our experimental result demonstrated that from the selected top-ten pre-trained DCNN image classifiers DenseNet169 outperformed with an average accuracy, precision, recall, and F1-score of 0.990, 0.974, 0.974, and 0.974, respectively. Moreover, it dashed the benchmark accuracy proposed by the creators of the dataset with 3.70%. Conclusions Even though the size of DenseNet169 is small compared to the experimented pre-trained DCNN image classifiers, yet, it is not suitable for mobile or edge devices. Further experimentation with mobile or small-size DCNN image classifiers is required to extend the applicability of the models in real-world demands. In addition, since all experiments used the SIPaKMeD dataset, additional experiments will be needed using new datasets to enhance the generalizability of the models.


Author(s):  
Yunfei Fu ◽  
Hongchuan Yu ◽  
Chih-Kuo Yeh ◽  
Tong-Yee Lee ◽  
Jian J. Zhang

Brushstrokes are viewed as the artist’s “handwriting” in a painting. In many applications such as style learning and transfer, mimicking painting, and painting authentication, it is highly desired to quantitatively and accurately identify brushstroke characteristics from old masters’ pieces using computer programs. However, due to the nature of hundreds or thousands of intermingling brushstrokes in the painting, it still remains challenging. This article proposes an efficient algorithm for brush Stroke extraction based on a Deep neural network, i.e., DStroke. Compared to the state-of-the-art research, the main merit of the proposed DStroke is to automatically and rapidly extract brushstrokes from a painting without manual annotation, while accurately approximating the real brushstrokes with high reliability. Herein, recovering the faithful soft transitions between brushstrokes is often ignored by the other methods. In fact, the details of brushstrokes in a master piece of painting (e.g., shapes, colors, texture, overlaps) are highly desired by artists since they hold promise to enhance and extend the artists’ powers, just like microscopes extend biologists’ powers. To demonstrate the high efficiency of the proposed DStroke, we perform it on a set of real scans of paintings and a set of synthetic paintings, respectively. Experiments show that the proposed DStroke is noticeably faster and more accurate at identifying and extracting brushstrokes, outperforming the other methods.


2021 ◽  
Author(s):  
Luke Gundry ◽  
Gareth Kennedy ◽  
Alan Bond ◽  
Jie Zhang

The use of Deep Neural Networks (DNNs) for the classification of electrochemical mechanisms based on training with simulations of the initial cycle of potential have been reported. In this paper,...


2021 ◽  
pp. 1-15
Author(s):  
Wenjun Tan ◽  
Luyu Zhou ◽  
Xiaoshuo Li ◽  
Xiaoyu Yang ◽  
Yufei Chen ◽  
...  

BACKGROUND: The distribution of pulmonary vessels in computed tomography (CT) and computed tomography angiography (CTA) images of lung is important for diagnosing disease, formulating surgical plans and pulmonary research. PURPOSE: Based on the pulmonary vascular segmentation task of International Symposium on Image Computing and Digital Medicine 2020 challenge, this paper reviews 12 different pulmonary vascular segmentation algorithms of lung CT and CTA images and then objectively evaluates and compares their performances. METHODS: First, we present the annotated reference dataset of lung CT and CTA images. A subset of the dataset consisting 7,307 slices for training and 3,888 slices for testing was made available for participants. Second, by analyzing the performance comparison of different convolutional neural networks from 12 different institutions for pulmonary vascular segmentation, the reasons for some defects and improvements are summarized. The models are mainly based on U-Net, Attention, GAN, and multi-scale fusion network. The performance is measured in terms of Dice coefficient, over segmentation ratio and under segmentation rate. Finally, we discuss several proposed methods to improve the pulmonary vessel segmentation results using deep neural networks. RESULTS: By comparing with the annotated ground truth from both lung CT and CTA images, most of 12 deep neural network algorithms do an admirable job in pulmonary vascular extraction and segmentation with the dice coefficients ranging from 0.70 to 0.85. The dice coefficients for the top three algorithms are about 0.80. CONCLUSIONS: Study results show that integrating methods that consider spatial information, fuse multi-scale feature map, or have an excellent post-processing to deep neural network training and optimization process are significant for further improving the accuracy of pulmonary vascular segmentation.


Sign in / Sign up

Export Citation Format

Share Document