Sound Classification
Recently Published Documents


TOTAL DOCUMENTS: 440 (FIVE YEARS: 240)
H-INDEX: 23 (FIVE YEARS: 8)

Author(s):  
Murugaiya Ramashini ◽  
P. Emeroylariffion Abas ◽  
Kusuma Mohanchandra ◽  
Liyanage C. De Silva

Birds are excellent environmental indicators and may signal the sustainability of an ecosystem; they provide provisioning, regulating, and supporting services. Birdlife conservation research therefore always takes centre stage. Given the airborne nature of birds and the density of tropical forest, identifying birds by their sound may be a better solution than visual identification. The goal of this study is to find the most appropriate cepstral features for classifying bird sounds accurately. Fifteen (15) endemic Bornean bird sounds were selected and segmented using an automated energy-based algorithm. Three (3) types of cepstral features were extracted: linear prediction cepstral coefficients (LPCC), mel frequency cepstral coefficients (MFCC), and gammatone frequency cepstral coefficients (GTCC), each used separately for classification with a support vector machine (SVM). Comparison of the prediction results shows that the model utilising GTCC features, at 93.3% accuracy, outperforms the models utilising MFCC and LPCC features, demonstrating the robustness of GTCC for bird sound classification. The result is significant for the advancement of bird sound classification research, which has many applications, such as eco-tourism and wildlife management.
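As a rough illustration of the pipeline described above, here is a minimal sketch in Python, assuming the librosa and scikit-learn libraries. It extracts MFCCs (the GTCC extraction that performed best in the paper would replace that step), pools them into fixed-length vectors, and trains an SVM. The `files` and `labels` variables are hypothetical placeholders for the segmented recordings and their species labels.

```python
# Minimal sketch: cepstral features + SVM for bird-sound classification.
# Shown with MFCCs via librosa; GTCC extraction would replace extract_mfcc.
import librosa
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def extract_mfcc(path, n_mfcc=13):
    """Load a bird-call recording and return a fixed-length feature vector
    (mean and standard deviation of each cepstral coefficient over time)."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# 'files' and 'labels' (one of 15 species each) are assumed to exist.
X = np.stack([extract_mfcc(f) for f in files])
X_train, X_test, y_train, y_test = train_test_split(X, labels, stratify=labels)

clf = SVC(kernel="rbf")   # the paper classifies with an SVM
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```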


2022 ◽  
Author(s):  
Mahbubeh Bahreini ◽  
Ramin Barati ◽  
Abbas Kamaly

Early diagnosis is crucial in the treatment of heart disease. Researchers have applied a variety of techniques for cardiovascular disease diagnosis, including the detection of heart sounds, an efficient and affordable diagnostic technique. Body organs, including the heart, generate several sounds, and these sounds differ between individuals. A number of methodologies have recently been proposed to detect and diagnose normal/abnormal sounds generated by the heart. The present study proposes a technique based on Mel-frequency cepstral coefficients, fractal dimension, and a hidden Markov model. It uses the fractal dimension to identify the S1 and S2 sounds; the Mel-frequency cepstral coefficients and their first- and second-order differences are then employed to extract features from the signals. An adaptive Hamming window length, determined by the S1-S2 interval, is a major advantage of the methodology. Heart sounds are classified as normal or abnormal using the improved hidden Markov model with the Baum-Welch and Viterbi algorithms. The proposed framework is evaluated on a number of datasets under various scenarios.
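As a loose sketch of this kind of pipeline, the snippet below trains one Gaussian hidden Markov model per class on MFCCs plus their first- and second-order differences, assuming librosa for feature extraction and the hmmlearn library (whose `fit` runs Baum-Welch internally); the paper's fractal-dimension S1/S2 detection and adaptive Hamming window are omitted for brevity. `normal_files` and `abnormal_files` are hypothetical placeholders.

```python
# Minimal sketch: MFCC + delta features, one HMM per class, classify by
# log-likelihood. Not the paper's full method (no fractal-dimension step).
import librosa
import numpy as np
from hmmlearn.hmm import GaussianHMM

def mfcc_with_deltas(path, n_mfcc=13):
    """Return frame-wise MFCCs stacked with 1st/2nd-order differences."""
    y, sr = librosa.load(path, sr=None)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    d1 = librosa.feature.delta(m, order=1)
    d2 = librosa.feature.delta(m, order=2)
    return np.vstack([m, d1, d2]).T   # shape: (frames, 3 * n_mfcc)

# Train one HMM per class; .fit runs Baum-Welch re-estimation.
models = {}
for label, paths in {"normal": normal_files, "abnormal": abnormal_files}.items():
    feats = [mfcc_with_deltas(p) for p in paths]
    hmm = GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
    hmm.fit(np.vstack(feats), lengths=[len(f) for f in feats])
    models[label] = hmm

def classify(path):
    """Pick the class whose HMM assigns the highest log-likelihood."""
    obs = mfcc_with_deltas(path)
    return max(models, key=lambda k: models[k].score(obs))
```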


2022 ◽  
Vol 185 ◽  
pp. 108437
Author(s):  
Achyut Mani Tripathi ◽  
Aakansha Mishra

2022 ◽  
Vol 188 ◽  
pp. 108589
Author(s):  
Beyda Tasar ◽  
Orhan Yaman ◽  
Turker Tuncer

2021 ◽  
Vol 47 (6) ◽  
pp. 514-528
Author(s):  
Doo-Seo Park ◽  
Min-Young Lee ◽  
Ki-hyun Kim ◽  
Hong-Chul Lee

2021 ◽  
Vol 11 (2) ◽  
pp. 165-174
Author(s):  
Türker TUNCER ◽  
Erhan AKBAL ◽  
Emrah AYDEMİR ◽  
Samir Brahim BELHAOUARI ◽  
Sengul DOGAN

2021 ◽  
Vol 5 (3) ◽  
pp. 334-343
Author(s):  
Türker TUNCER ◽  
Emrah AYDEMİR ◽  
Fatih ÖZYURT ◽  
Sengul DOGAN ◽  
Samir Brahim BELHAOUARI ◽  
...  

2021 ◽  
Vol 10 (4) ◽  
pp. 72
Author(s):  
Eleni Tsalera ◽  
Andreas Papadakis ◽  
Maria Samarakou

The paper investigates retraining options and the performance of pre-trained Convolutional Neural Networks (CNNs) for sound classification. CNNs were initially designed for image classification and recognition and were later extended to sound classification. Transfer learning, retraining already-trained networks on different datasets, is a promising paradigm. We selected three 'Image'-trained and two 'Sound'-trained CNNs, namely GoogLeNet, SqueezeNet, ShuffleNet, VGGish, and YAMNet, and applied transfer learning. We explored the influence of key retraining parameters, including the optimizer, the mini-batch size, the learning rate, and the number of epochs, on classification accuracy and on processing time, both for sound preprocessing (the preparation of the scalograms and spectrograms) and for CNN training. The UrbanSound8K, ESC-10, and Air Compressor open sound datasets were employed. Using a two-fold criterion based on classification accuracy and time needed, we selected the 'champion' transfer-learning parameter combinations, discussed the consistency of the classification results, and explored possible benefits from fusing the classification estimations. The Sound CNNs achieved better classification accuracy, reaching an average of 96.4% for UrbanSound8K, 91.25% for ESC-10, and 100% for the Air Compressor dataset.
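A minimal sketch of such a transfer-learning setup, assuming PyTorch/torchvision rather than the toolchain used in the paper: load an ImageNet-pretrained GoogLeNet, replace its classification head for a 10-class dataset such as ESC-10, and retrain with the kinds of parameters the paper sweeps (optimizer, learning rate, mini-batch size, number of epochs). `train_loader` is a hypothetical loader yielding batches of 3x224x224 spectrogram or scalogram images with class labels.

```python
# Minimal sketch: transfer learning on a pre-trained 'Image' CNN for
# sound classification via spectrogram/scalogram images.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 10                             # e.g. ESC-10 has 10 classes
net = models.googlenet(weights="DEFAULT")    # ImageNet-pretrained weights
net.fc = nn.Linear(net.fc.in_features, num_classes)  # new classifier head

# Retraining parameters of the kind the paper sweeps over:
optimizer = torch.optim.SGD(net.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

net.train()
for epoch in range(10):                      # number of epochs
    for specs, labels in train_loader:       # mini-batches of spectrograms
        optimizer.zero_grad()
        loss = criterion(net(specs), labels)
        loss.backward()
        optimizer.step()
```

Freezing the convolutional backbone and retraining only the new head is a common variant that trades some accuracy for much shorter training time, one axis of the accuracy-versus-time criterion the paper applies.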


2021 ◽  
Author(s):  
Lei Xu ◽  
Jianhong Cheng ◽  
Jin Liu ◽  
Hulin Kuang ◽  
Fan Wu ◽  
...  
