Audio fingerprint analysis for speech processing using deep learning method

Author(s):  
Ali Altalbe
2019 ◽  
Vol 9 (22) ◽  
pp. 4749
Author(s):  
Lingyun Jiang ◽  
Kai Qiao ◽  
Linyuan Wang ◽  
Chi Zhang ◽  
Jian Chen ◽  
...  

Decoding human brain activities, especially reconstructing human visual stimuli via functional magnetic resonance imaging (fMRI), has gained increasing attention in recent years. However, the high dimensionality and small quantity of fMRI data impose restrictions on satisfactory reconstruction, especially for the reconstruction method with deep learning requiring huge amounts of labelled samples. When compared with the deep learning method, humans can recognize a new image because our human visual system is naturally capable of extracting features from any object and comparing them. Inspired by this visual mechanism, we introduced the mechanism of comparison into deep learning method to realize better visual reconstruction by making full use of each sample and the relationship of the sample pair by learning to compare. In this way, we proposed a Siamese reconstruction network (SRN) method. By using the SRN, we improved upon the satisfying results on two fMRI recording datasets, providing 72.5% accuracy on the digit dataset and 44.6% accuracy on the character dataset. Essentially, this manner can increase the training data about from n samples to 2n sample pairs, which takes full advantage of the limited quantity of training samples. The SRN learns to converge sample pairs of the same class or disperse sample pairs of different class in feature space.


2021 ◽  
Author(s):  
Francesco Banterle ◽  
Rui Gong ◽  
Massimiliano Corsini ◽  
Fabio Ganovelli ◽  
Luc Van Gool ◽  
...  

Energies ◽  
2021 ◽  
Vol 14 (15) ◽  
pp. 4595
Author(s):  
Parisa Asadi ◽  
Lauren E. Beckingham

X-ray CT imaging provides a 3D view of a sample and is a powerful tool for investigating the internal features of porous rock. Reliable phase segmentation in these images is highly necessary but, like any other digital rock imaging technique, is time-consuming, labor-intensive, and subjective. Combining 3D X-ray CT imaging with machine learning methods that can simultaneously consider several extracted features in addition to color attenuation, is a promising and powerful method for reliable phase segmentation. Machine learning-based phase segmentation of X-ray CT images enables faster data collection and interpretation than traditional methods. This study investigates the performance of several filtering techniques with three machine learning methods and a deep learning method to assess the potential for reliable feature extraction and pixel-level phase segmentation of X-ray CT images. Features were first extracted from images using well-known filters and from the second convolutional layer of the pre-trained VGG16 architecture. Then, K-means clustering, Random Forest, and Feed Forward Artificial Neural Network methods, as well as the modified U-Net model, were applied to the extracted input features. The models’ performances were then compared and contrasted to determine the influence of the machine learning method and input features on reliable phase segmentation. The results showed considering more dimensionality has promising results and all classification algorithms result in high accuracy ranging from 0.87 to 0.94. Feature-based Random Forest demonstrated the best performance among the machine learning models, with an accuracy of 0.88 for Mancos and 0.94 for Marcellus. The U-Net model with the linear combination of focal and dice loss also performed well with an accuracy of 0.91 and 0.93 for Mancos and Marcellus, respectively. In general, considering more features provided promising and reliable segmentation results that are valuable for analyzing the composition of dense samples, such as shales, which are significant unconventional reservoirs in oil recovery.


2021 ◽  
Vol 11 (12) ◽  
pp. 5488
Author(s):  
Wei Ping Hsia ◽  
Siu Lun Tse ◽  
Chia Jen Chang ◽  
Yu Len Huang

The purpose of this article is to evaluate the accuracy of the optical coherence tomography (OCT) measurement of choroidal thickness in healthy eyes using a deep-learning method with the Mask R-CNN model. Thirty EDI-OCT of thirty patients were enrolled. A mask region-based convolutional neural network (Mask R-CNN) model composed of deep residual network (ResNet) and feature pyramid networks (FPNs) with standard convolution and fully connected heads for mask and box prediction, respectively, was used to automatically depict the choroid layer. The average choroidal thickness and subfoveal choroidal thickness were measured. The results of this study showed that ResNet 50 layers deep (R50) model and ResNet 101 layers deep (R101). R101 U R50 (OR model) demonstrated the best accuracy with an average error of 4.85 pixels and 4.86 pixels, respectively. The R101 ∩ R50 (AND model) took the least time with an average execution time of 4.6 s. Mask-RCNN models showed a good prediction rate of choroidal layer with accuracy rates of 90% and 89.9% for average choroidal thickness and average subfoveal choroidal thickness, respectively. In conclusion, the deep-learning method using the Mask-RCNN model provides a faster and accurate measurement of choroidal thickness. Comparing with manual delineation, it provides better effectiveness, which is feasible for clinical application and larger scale of research on choroid.


Author(s):  
Hiroaki Iwata ◽  
Tatsuru Matsuo ◽  
Hideaki Mamada ◽  
Takahisa Motomura ◽  
Mayumi Matsushita ◽  
...  

2021 ◽  
Vol 1948 (1) ◽  
pp. 012026
Author(s):  
Huang Lu ◽  
Mao Xiaoyan ◽  
Xing Yan

Sign in / Sign up

Export Citation Format

Share Document