CNN-Based Acoustic Scene Classification System

Yerin Lee; Soyoung Lim; Il-Youp Kwak

doi:10.3390/electronics10040371

Electronics ◽

10.3390/electronics10040371 ◽

2021 ◽

Vol 10 (4) ◽

pp. 371

Author(s):

Yerin Lee ◽

Soyoung Lim ◽

Il-Youp Kwak

Keyword(s):

Low Complexity ◽

Classification Model ◽

Weight Class ◽

Scene Classification ◽

Audio File ◽

General Classification ◽

Team Task ◽

Real World Applications ◽

Activation Mapping

Acoustic scene classification (ASC) categorizes an audio file according to the environment in which it was recorded. The problem has long been studied in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge. This paper presents the solution to Task 1 of the DCASE 2020 challenge submitted by the Chung-Ang University team. Task 1 addressed two challenges that ASC faces in real-world applications: audio recorded with different devices should be classified by a model that generalizes across those devices, and the model should have low complexity. We proposed two models to address these problems. First, a more general classification model was built by combining harmonic-percussive source separation (HPSS) and delta and delta-delta features with four different models. Second, using the same features, depthwise separable convolution was applied to the convolutional layers to develop a low-complexity model. Moreover, using gradient-weighted class activation mapping (Grad-CAM), we investigated which parts of the features our model attends to. Our proposed system ranked 9th and 7th in the competition for these two subtasks, respectively.
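To illustrate why depthwise separable convolution lowers model complexity, the following minimal sketch compares the parameter count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The function names and the channel and kernel sizes are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def conv2d_params(in_ch, out_ch, k, bias=True):
    """Parameter count of a standard k x k convolution layer."""
    return in_ch * out_ch * k * k + (out_ch if bias else 0)


def depthwise_separable_params(in_ch, out_ch, k, bias=True):
    """Parameter count of a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    # Depthwise step: one k x k filter per input channel.
    depthwise = in_ch * k * k + (in_ch if bias else 0)
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise = in_ch * out_ch + (out_ch if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
standard = conv2d_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
print(standard, separable, round(standard / separable, 2))
```

For these assumed sizes the separable layer needs roughly an eighth of the parameters of the standard layer, which is the kind of saving a low-complexity subtask rewards.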

A Low-Complexity Deep Learning Framework For Acoustic Scene Classification

10.31219/osf.io/cmgws ◽

2021 ◽

Author(s):

Lam Pham ◽

Hieu Tang ◽

Anahid Jalal ◽

Alexander Schindler ◽

Ross King

Keyword(s):

Audio Signal ◽

Low Complexity ◽

Scene Classification ◽

Late Fusion ◽

Classification Result ◽

Urban Scenes ◽

Model Compression ◽

Front End ◽

Learning Frameworks

In this paper, we present a low-complexity deep learning framework for acoustic scene classification (ASC). The proposed framework can be separated into three main steps: front-end spectrogram extraction, back-end classification, and late fusion of predicted probabilities. First, we use Mel filter, Gammatone filter, and Constant-Q Transform (CQT) to transform the raw audio signal into spectrograms, where both frequency and temporal features are presented. The three spectrograms are then fed into three individual back-end convolutional neural networks (CNNs), which classify them into ten urban scenes. Finally, a late fusion of the three predicted probabilities obtained from the three CNNs is conducted to achieve the final classification result. To reduce the complexity of our proposed CNN network, we apply two model compression techniques: model restriction and decomposed convolution. Our extensive experiments, which are conducted on the DCASE 2021 (IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events) Task 1A development dataset, achieve a low-complexity CNN-based framework with 128 KB trainable parameters and the best classification accuracy of 66.7%, improving the DCASE baseline by 19.0%.
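The late-fusion step can be sketched as follows. The abstract does not state which fusion rule is used, so this example assumes a simple mean of the per-class probabilities from the three back-end models; the function name and the probability values are hypothetical, and three classes are used instead of ten to keep the example short.

```python
def late_fusion(prob_sets):
    """Average per-class probabilities across models and pick the top class.

    prob_sets: list of probability vectors, one per model, all of equal length.
    Returns (fused_probabilities, predicted_class_index).
    """
    n_models = len(prob_sets)
    n_classes = len(prob_sets[0])
    fused = [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]
    predicted = max(range(n_classes), key=fused.__getitem__)
    return fused, predicted


# Hypothetical outputs of the Mel, Gammatone, and CQT branches.
mel = [0.6, 0.3, 0.1]
gammatone = [0.2, 0.5, 0.3]
cqt = [0.3, 0.5, 0.2]

fused, label = late_fusion([mel, gammatone, cqt])
print(fused, label)
```

Here the Mel branch alone would choose class 0, but the fused probabilities favor class 1, which shows how fusion can override a single branch.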

FUNDAMENTAL NATURE AND CLASSIFICATION OF LABOR MIGRATION

Innovation in the economy ◽

10.26739/2181-9491-2020-1-2 ◽

2020 ◽

Vol 1 (3) ◽

pp. 14-23

Author(s):

Tulkin Chulliev ◽

Keyword(s):

Labor Migration ◽

General Classification ◽

Fundamental Nature

The article explains the fundamental nature of migration by combining the definitions given by other scholars. The issue of labor migration is analyzed. One of the most important problems in contemporary migration processes, the problem of classification, is researched, and a general classification is provided.

Connectionist Temporal Classification Model for Dynamic Hand Gesture Recognition using RGB and Optical flow Data

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/4/8 ◽

2020 ◽

Vol 17 (4) ◽

pp. 497-506

Author(s):

Sunil Patel ◽

Ramji Makwana

Keyword(s):

Neural Network ◽

Optical Flow ◽

Gesture Recognition ◽

Hand Gesture Recognition ◽

Classification Model ◽

Hand Gesture ◽

Flow Data ◽

Dynamic Hand Gesture Recognition ◽

Connectionist Temporal Classification

Automatic classification of dynamic hand gestures is challenging because of the large diversity within gesture classes, low resolution, and the fact that gestures are performed with the fingers. Because of these challenges, many researchers focus on this area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer used for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal Red, Green, Blue, Depth (RGBD) and optical flow data and passes these features to a Long Short-Term Memory (LSTM) recurrent network for frame-to-frame probability generation, with a Connectionist Temporal Classification (CTC) network for loss calculation. We calculate optical flow from the Red, Green, Blue (RGB) data to capture the motion information present in the video. The CTC model is used to efficiently evaluate all possible alignments of a hand gesture via dynamic programming and to check frame-to-frame consistency of the visual similarity of hand gestures in the unsegmented input stream. The CTC network finds the most probable sequence of frames for a class of gesture. The frame with the highest probability value is selected from the CTC network by max decoding. The entire CTC network is trained end-to-end by calculating the CTC loss for gesture recognition. We use the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On this VIVA dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms and achieves an accuracy of 86%.
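Max decoding over CTC outputs is commonly implemented as best-path (greedy) decoding: take the most probable label at each frame, merge consecutive repeats, and drop the blank symbol. The sketch below shows that standard procedure; it is an assumption that the paper's decoder follows it, and the function name, the blank index, and the frame probabilities are hypothetical.

```python
def ctc_greedy_decode(frame_probs, blank=0):
    """Best-path CTC decoding.

    frame_probs: list of per-frame probability vectors over labels.
    blank: index of the CTC blank symbol (assumed to be 0 here).
    """
    # Most probable label at each frame.
    best_path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]

    decoded = []
    previous = None
    for label in best_path:
        # Keep a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical per-frame probabilities over {blank, gesture 1, gesture 2}.
frames = [
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.8, 0.1],  # gesture 1
    [0.2, 0.7, 0.1],  # gesture 1 (repeat, merged)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],  # gesture 2
    [0.1, 0.2, 0.7],  # gesture 2 (repeat, merged)
]
print(ctc_greedy_decode(frames))
```

A blank between two identical labels keeps them as separate occurrences, which is how CTC represents the same gesture performed twice in a row.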

Scene classification of ambiguous visual information

5th International Conference on Visual Information Engineering (VIE 2008) ◽

10.1049/cp:20080403 ◽

2008 ◽

Author(s):

L. Dong ◽

E. Izquierdo

Keyword(s):

Visual Information ◽

Scene Classification

Growth Stages Classification of Potato Crop Based on Analysis of Spectral Response and Variables Optimization

Sensors ◽

10.3390/s20143995 ◽

2020 ◽

Vol 20 (14) ◽

pp. 3995 ◽

Cited By ~ 3

Author(s):

Ning Liu ◽

Ruomei Zhao ◽

Lang Qiao ◽

Yao Zhang ◽

Minzan Li ◽

...

Keyword(s):

Growth Stage ◽

Spectral Response ◽

Classification Model ◽

Growth Stages ◽

Potato Field ◽

Spectral Bands ◽

Svm Model ◽

Selection Algorithms ◽

Potato Crops

Potato is the world’s fourth-largest food crop, following rice, wheat, and maize. Unlike other crops, it is a typical root crop with a special growth cycle pattern and underground tubers, which makes it harder to track the progress of potatoes and to provide automated crop management. The classification of growth stages is of great significance for timely management in the potato field. This paper aims to study how to classify the growth stage of potato crops accurately on the basis of spectroscopy technology. To develop a classification model that monitors the growth stage of potato crops, field experiments were conducted at the tillering stage (S1), tuber formation stage (S2), tuber bulking stage (S3), and tuber maturation stage (S4). After spectral data pre-processing, the dynamic changes in chlorophyll content and spectral response during growth were analyzed. A classification model was then established using the support vector machine (SVM) algorithm based on spectral bands and the wavelet coefficients obtained from the continuous wavelet transform (CWT) of reflectance spectra. The spectral variables, which include sensitive spectral bands and feature wavelet coefficients, were optimized using three selection algorithms to improve the classification performance of the model. The selection algorithms include correlation analysis (CA), the successive projection algorithm (SPA), and the random frog (RF) algorithm. The model results were used to compare the performance of various methods. The CWT-SPA-SVM model exhibited excellent performance. The classification accuracies on the training set (Atrain) and the test set (Atest) were 100% and 97.37%, respectively, demonstrating the good classification capability of the model. The difference between Atrain and the accuracy of cross-validation (Acv) was 1%, which showed that the model has good stability. Therefore, the CWT-SPA-SVM model can be used to classify the growth stages of potato crops accurately. This study provides an important support method for the classification of growth stages in the potato field.

Uncertainty-Aware Deep Learning-Based Cardiac Arrhythmias Classification Model of Electrocardiogram Signals

Computers ◽

10.3390/computers10060082 ◽

2021 ◽

Vol 10 (6) ◽

pp. 82

Author(s):

Ahmad O. Aseeri

Keyword(s):

Deep Learning ◽

Cardiac Arrhythmias ◽

Large Scale ◽

Clinical Decision Making ◽

Probabilistic Approach ◽

Classification Model ◽

Gating Mechanism ◽

Uncertainty Estimates ◽

Wide Range

Deep learning-based methods have emerged as one of the most effective and practical solutions to a wide range of medical problems, including the diagnosis of cardiac arrhythmias. A critical step toward early diagnosis of many heart dysfunction diseases is the accurate detection and classification of cardiac arrhythmias, which can be achieved via electrocardiograms (ECGs). Motivated by the desire to enhance conventional clinical methods for diagnosing cardiac arrhythmias, we introduce an uncertainty-aware deep learning-based predictive model design for accurate large-scale classification of cardiac arrhythmias, successfully trained and evaluated using three benchmark medical datasets. In addition, considering that the quantification of uncertainty estimates is vital for clinical decision-making, our method incorporates a probabilistic approach to capture the model’s uncertainty using a Bayesian-based approximation method without introducing additional parameters or significant changes to the network’s architecture. Although many arrhythmia classification solutions with various ECG feature engineering techniques have been reported in the literature, the AI-based probabilistic method introduced in this paper outperforms existing methods, with multiclass classification F1 scores of 98.62% and 96.73% on the MIT-BIH dataset (20 annotations), 99.23% and 96.94% on the INCART dataset (eight annotations), and 97.25% and 96.73% on the BIDMC dataset (six annotations), for the deep ensemble and probabilistic mode, respectively. We demonstrate our method’s high performance and statistical reliability in numerical experiments on language modeling using the gating mechanism of recurrent neural networks.
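One common way to turn several stochastic predictions (for example, from ensemble members or repeated sampled forward passes) into an uncertainty estimate is the entropy of the averaged class probabilities. The sketch below illustrates that general idea only; the abstract does not say which Bayesian approximation or uncertainty measure the paper uses, and the function name and sample values are hypothetical.

```python
import math


def predictive_entropy(sample_probs):
    """Mean class probabilities and their entropy (in nats).

    sample_probs: list of probability vectors, one per stochastic prediction.
    Higher entropy indicates a less certain prediction.
    """
    n_samples = len(sample_probs)
    n_classes = len(sample_probs[0])
    mean = [sum(s[c] for s in sample_probs) / n_samples for c in range(n_classes)]
    # Terms with zero probability contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy


# Hypothetical predictions for two heartbeats over two classes.
agreeing = [[0.95, 0.05], [0.9, 0.1], [0.97, 0.03]]
disagreeing = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

print(predictive_entropy(agreeing)[1])
print(predictive_entropy(disagreeing)[1])
```

When the sampled predictions disagree, the averaged distribution flattens and the entropy rises, which is the signal a clinician-facing system could use to flag a beat for review.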

Distinguishing Planting Structures of Different Complexity from UAV Multispectral Images

Sensors ◽

10.3390/s21061994 ◽

2021 ◽

Vol 21 (6) ◽

pp. 1994

Author(s):

Qian Ma ◽

Wenting Han ◽

Shenjin Huang ◽

Shide Dong ◽

Guang Li ◽

...

Keyword(s):

Remote Sensing ◽

Confusion Matrix ◽

Object Oriented ◽

Low Complexity ◽

Classification Model ◽

Recursive Feature Elimination ◽

Support Vector ◽

Multispectral Images ◽

High Complexity ◽

Structure Complexity

This study explores the classification potential of a multispectral classification model for farmland with planting structures of different complexity. Unmanned aerial vehicle (UAV) remote sensing technology is used to obtain multispectral images of three study areas with low-, medium-, and high-complexity planting structures, containing three, five, and eight types of crops, respectively. The feature subsets of the three study areas are selected by recursive feature elimination (RFE). Object-oriented random forest (OB-RF) and object-oriented support vector machine (OB-SVM) classification models are established for the three study areas. After training the models with the feature subsets, the classification results are evaluated using a confusion matrix. The OB-RF and OB-SVM models’ classification accuracies are 97.09% and 99.13%, respectively, for the low-complexity planting structure. The equivalent values are 92.61% and 99.08% for the medium-complexity planting structure and 88.99% and 97.21% for the high-complexity planting structure. As the planting structure complexity changed from low to high, both models’ overall accuracy decreased: that of the OB-RF model decreased by 8.1%, whereas that of the OB-SVM model decreased by only 1.92%. For farmland with fragmentary plots and a high-complexity planting structure, OB-SVM achieves an overall classification accuracy of 97.21% and a single-crop extraction accuracy of at least 85.65%. Therefore, UAV multispectral remote sensing can be used for classification applications in highly complex planting structures.
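Evaluating a classifier with a confusion matrix, as described above, can be sketched as follows. The matrix values are invented for illustration and are not the study's data; the sketch assumes rows are reference (true) classes and columns are predicted classes, and the function names are hypothetical.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def per_class_accuracy(cm):
    """Per-class accuracy: correct predictions divided by the row total."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


# Hypothetical 3-crop confusion matrix (rows: reference, columns: predicted).
cm = [
    [50, 2, 0],
    [3, 45, 2],
    [0, 5, 43],
]

print(overall_accuracy(cm))
print(per_class_accuracy(cm))
```

The overall figure can hide a weak class, which is why the abstract reports a minimum single-crop accuracy alongside the overall accuracy.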

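Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger [Comparison of Feature Selection Optimization on Naïve Bayes for Airline Passenger Satisfaction Classification]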

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view, but must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests using attribute weighting to relax the independence assumption, which can be done using particle swarm optimization (PSO) and genetic algorithm (GA) through feature selection. This study compares PSO and GA optimization on Naïve Bayes for the classification of Airline Passenger Satisfaction data taken from www.kaggle.com. After testing, the best performance is obtained by classifying the Airline Passenger Satisfaction data using the Naïve Bayes algorithm with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.

Classification method of marine target motion pattern based on spatial-temporal trajectories

Journal of Computational Methods in Sciences and Engineering ◽

10.3233/jcm-215383 ◽

2021 ◽

pp. 1-15

Author(s):

Baichen Jiang ◽

Wei Zhou ◽

Jian Guan ◽

Jialong Jin

Keyword(s):

Pattern Classification ◽

Training Sample ◽

Classification Algorithm ◽

Classification Model ◽

Motion Pattern ◽

Motion Patterns ◽

Lp Norm ◽

Sparse Representation Classification ◽

Route Safety

Classifying the motion patterns of marine targets is important for improving the efficiency of target surveillance and management in marine areas and for guaranteeing sea route safety. This paper proposes a moving target classification algorithm model based on channel extraction, segmentation, LCSCA, and lp norm minimization. The algorithm first analyzes the overall distribution of channels in a specific region and defines the categories of potential ship motion patterns. On this basis, through a secondary segmentation processing method, it obtains several line segment trajectories as training sample sets to improve the accuracy of the classification algorithm. It then uses the Least-squares Cubic Spline Curves Approximation (LCSCA) technique to represent the training sample sets and builds a motion pattern classification sample dictionary. Finally, it uses an lp norm minimized sparse representation classification model to classify the motion patterns. A verification experiment based on a real spatial-temporal trajectory dataset indicates that this method can effectively classify the motion patterns of marine targets and shows better time performance and classification accuracy than other representative classification methods.

Low-Complexity Acoustic Scene Classification Using Data Generation Based On Primary Ambient Extraction

10.1109/bmsb53066.2021.9547178 ◽

2021 ◽

Author(s):

Chuang Shi ◽

Haocong Yang ◽

Yingzi Liu ◽

Jiangnan Liang

Keyword(s):

Low Complexity ◽

Data Generation ◽

Scene Classification ◽

Using Data
