Audio Signals
Recently Published Documents


TOTAL DOCUMENTS

1075
(FIVE YEARS 238)

H-INDEX

29
(FIVE YEARS 4)

2022 ◽  
pp. 125-142
Author(s):  
Vijay Srinivas Tida ◽  
Raghabendra Shah ◽  
Xiali Hei

Laser-based audio signal injection can be used to attack voice-controllable systems. An attacker can aim amplitude-modulated light at the microphone's aperture, and the injected signal acts as a remote voice-command attack on the system. Attackers exploit this vulnerability to steal assets, whether physical devices or virtual goods, for example by placing orders or withdrawing money. Detection of these signals is therefore important, because almost every device can be attacked using such amplitude-modulated laser signals. In this project, the authors use deep learning to classify incoming signals as either normal voice commands or laser-based audio signals. Mel frequency cepstral coefficients (MFCC) are derived from the audio signals and used for classification. If an audio signal is identified as a laser signal, the voice command can be disabled and an alert displayed to the victim. The maximum accuracy of the machine learning model was 100% on the test data and approximately 95% in real-world conditions.
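A minimal sketch of the MFCC-plus-classifier pipeline the abstract describes, assuming librosa for feature extraction and scikit-learn for the model; the file names, the binary labels, and the choice of an MLP classifier are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

def extract_mfcc(path, n_mfcc=13, sr=16000):
    """Load one recording and summarize it as the mean MFCC vector over time."""
    signal, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Hypothetical (file, label) pairs: 0 = normal voice command, 1 = laser-injected audio.
dataset = [("voice_001.wav", 0), ("laser_001.wav", 1)]  # ...extend with real recordings

X = np.array([extract_mfcc(path) for path, _ in dataset])
y = np.array([label for _, label in dataset])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```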


Author(s):  
Osman Balli ◽  
Yakup Kutlu

Audio signals are among the most important signals in the field of biomedicine. Sound signals obtained from the body give information about its general condition. However, the presence of different overlapping sounds when body audio signals are recorded, or when physicians listen to them, makes it difficult to diagnose disease from these signals. In addition to isolating these sounds from the external environment, it is also necessary to separate the sounds of different parts of the body during analysis. Separating heart, lung, and abdominal sounds will, in particular, facilitate digital analysis. In this study, a dataset was created from lung, heart, and abdominal sounds, and MFCC (Mel Frequency Cepstral Coefficient) data were obtained. The obtained coefficients were used to train a CNN (Convolutional Neural Network) model. The purpose of this study is to classify audio signals; with this classification, a control system can be created that prevents erroneous recordings when physicians record body sounds. Training accuracy was about 98% and test accuracy about 85%.
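A minimal sketch of how MFCC matrices could feed a small CNN for three-way body-sound classification, assuming librosa and TensorFlow/Keras; the input shape, layer sizes, and class mapping (heart, lung, abdominal) are assumptions for illustration, not the study's architecture.

```python
import numpy as np
import librosa
import tensorflow as tf

def mfcc_image(path, n_mfcc=40, frames=128, sr=22050):
    """Return a fixed-size (n_mfcc, frames, 1) MFCC matrix suitable as CNN input."""
    y, sr = librosa.load(path, sr=sr)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    m = librosa.util.fix_length(m, size=frames, axis=1)  # pad or trim the time axis
    return m[..., np.newaxis]

# Hypothetical class labels: 0 = heart, 1 = lung, 2 = abdominal.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(40, 128, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=30)
```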


Symmetry ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 17
Author(s):  
Wanying Dai ◽  
Xiangliang Xu ◽  
Xiaoming Song ◽  
Guodong Li

Audio signals occupy a large data space and exhibit strong correlation, and traditional encryption algorithms cannot meet the requirements of efficiency and security. To solve this problem, an audio encryption algorithm based on the Chen memristor chaotic system is proposed. The core idea of the algorithm is to encrypt the audio signal into color image information. Most traditional audio encryption algorithms transmit the ciphertext in the form of noise, which easily attracts the attention of attackers; in this paper, a special encryption method is used to obtain higher security. First, the Fast Walsh–Hadamard Transform (FWHT) is used to compress and denoise the signal. Unlike the Fast Fourier Transform (FFT) and the Discrete Cosine Transform (DCT), the FWHT has good energy compaction characteristics, and its rectangular basis functions can be implemented in digital circuits more efficiently than the trigonometric basis functions of the FFT. The reconstructed dual-channel audio signal is transformed into the R and B layers of a digital image matrix, respectively. Furthermore, a new Chen memristor chaotic system solves the periodic window problems, such as the limited chaos range and nonuniform distribution; it generates a mask block with high complexity, which is filled into the G layer of the color image matrix to obtain a color audio image. Next, by combining plaintext information with the color audio image, interactive channel shuffling not only weakens the correlation between adjacent samples but also effectively resists chosen-plaintext attacks. Finally, the cryptographic block is used for overlapping diffusion encryption to fill the silent periods of the speech signal, yielding the ciphertext audio. Experimental results and comparative analysis show that the algorithm is suitable for different types of audio signals and can resist many common cryptanalytic attacks. Compared with similar audio encryption algorithms, the proposed algorithm achieves better security indices and greatly improved efficiency.
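A minimal NumPy sketch of the FWHT compression step mentioned above; the frame length and the number of retained coefficients are illustrative assumptions, and the chaotic masking, shuffling, and diffusion stages of the scheme are not shown.

```python
import numpy as np

def fwht(x):
    """Iterative Fast Walsh-Hadamard Transform; the length of x must be a power of two."""
    a = np.array(x, dtype=float)
    n = len(a)
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def ifwht(x):
    """Inverse FWHT (the transform is self-inverse up to a 1/n factor)."""
    a = fwht(x)
    return a / len(a)

# Example: compress one audio frame by keeping only the strongest coefficients.
frame = np.random.randn(1024)              # stand-in for one audio frame
coeffs = fwht(frame)
keep = 256                                 # assumed number of retained coefficients
weakest = np.argsort(np.abs(coeffs))[:-keep]
coeffs[weakest] = 0.0                      # discard low-energy coefficients
reconstructed = ifwht(coeffs)
```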


2021 ◽  
Author(s):  
Alexandra Hamlin ◽  
Erik Kobylarz ◽  
James Lever ◽  
Susan Taylor ◽  
Laura Ray

This paper investigates the feasibility of using non-cerebral, time-series data to detect epileptic seizures. Data were recorded from fifteen patients (7 male, 5 female, 3 not noted, mean age 36.17 yrs), five of whom had a total of seven seizures. Patients were monitored in an inpatient setting using standard video electroencephalography (vEEG), while also wearing sensors monitoring electrocardiography, electrodermal activity, electromyography, accelerometry, and audio signals (vocalizations). A systematic and detailed study was conducted to identify the sensors and the features derived from the non-cerebral sensors that contribute most significantly to the separability of data acquired during seizures from non-seizure data. Post-processing of the data using linear discriminant analysis (LDA) shows that seizure data are strongly separable from non-seizure data based on features derived from the recorded signals. The mean area under the receiver operating characteristic (ROC) curve for each individual patient who experienced a seizure during data collection, calculated using LDA, was 0.9682. The features that contribute most significantly to seizure detection differ for each patient. The results show that a multimodal approach to seizure detection using the specified sensor suite is promising for detecting seizures with both sensitivity and specificity. Moreover, the study provides a means to quantify the contribution of each sensor and feature to separability. Development of a non-electroencephalography (EEG) based seizure detection device would give doctors a more accurate seizure count outside of the clinical setting, improving treatment and the quality of life of epilepsy patients.
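A minimal sketch, assuming scikit-learn, of how LDA separability and an ROC AUC like the one reported above could be computed; the synthetic feature matrix and labels stand in for the real multimodal sensor features, and the coefficient-based ranking is only a rough proxy for the paper's per-feature contribution analysis.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

# Hypothetical per-window features (heart rate, EDA level, EMG power, ...) and
# binary labels (1 = window overlaps a seizure, 0 = otherwise).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))
y = rng.integers(0, 2, size=500)

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
scores = lda.decision_function(X)          # projection onto the discriminant axis
print("AUC:", roc_auc_score(y, scores))

# Coefficient magnitudes give a crude ranking of feature contributions to separability.
ranking = np.argsort(np.abs(lda.coef_[0]))[::-1]
print("most informative feature indices:", ranking[:5])
```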


2021 ◽  
Author(s):  
A. Yagodkin ◽  
V. Tuinov ◽  
V. Lavlinskiy ◽  
Yu. Tabakov

The article presents the results of a study on the feasibility of using the Daubechies wavelet transform for processing low-frequency signals recorded from the cerebral cortex. The advantages of using the Daubechies wavelet transform for audio signals with natural interference are discussed.
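A minimal sketch of Daubechies wavelet denoising with PyWavelets; the db4 wavelet, the sampling rate, the decomposition level, and the threshold are illustrative choices, not those used in the study.

```python
import numpy as np
import pywt

# Hypothetical low-frequency signal with additive noise.
fs = 250                                   # assumed sampling rate, Hz
t = np.arange(0, 4, 1 / fs)
signal = np.sin(2 * np.pi * 8 * t) + 0.3 * np.random.randn(t.size)

# Multilevel Daubechies (db4) decomposition, soft-threshold the detail
# coefficients, then reconstruct the denoised signal.
coeffs = pywt.wavedec(signal, "db4", level=4)
threshold = 0.2
denoised_coeffs = [coeffs[0]] + [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
denoised = pywt.waverec(denoised_coeffs, "db4")
```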


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1613
Author(s):  
Daniel Guerrero ◽  
Pedro Rivera ◽  
Gerardo Febres ◽  
Carlos Gershenson

The accurate description of a complex process should take into account not only the interacting elements involved but also the scale of the description. Therefore, there cannot be a single measure for describing the associated complexity of a process, nor a single metric applicable in all scenarios. This article introduces a framework based on multiscale entropy to characterize the complexity associated with the most identifiable characteristic of songs: the melody. We are particularly interested in measuring the complexity of popular songs and identifying levels of complexity that statistically explain the listeners' preferences. We analyze the relationship between complexity and popularity using a database of popular songs and their relative position in a preferences ranking. There is a tendency toward a positive association between complexity and acceptance (success) of a song that is, however, not significant after adjusting for multiple testing.
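A minimal sketch of multiscale (sample) entropy over a symbolic melody, in plain NumPy; the pitch sequence, embedding dimension, tolerance, and scale range are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a 1-D series with embedding dimension m and tolerance r * std."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)

    def match_count(dim):
        templates = np.array([x[i:i + dim] for i in range(len(x) - dim)])
        total = 0
        for i in range(len(templates)):
            dist = np.max(np.abs(templates - templates[i]), axis=1)
            total += np.sum(dist <= tol) - 1   # exclude the self-match
        return total

    b, a = match_count(m), match_count(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

def multiscale_entropy(x, scales=range(1, 6), m=2, r=0.2):
    """Coarse-grain the series at each scale and compute its sample entropy."""
    x = np.asarray(x, dtype=float)
    values = []
    for s in scales:
        n = len(x) // s
        coarse = x[:n * s].reshape(n, s).mean(axis=1)
        values.append(sample_entropy(coarse, m, r))
    return values

# Example: a melody encoded as a sequence of MIDI pitch numbers (illustrative).
melody = np.array([60, 62, 64, 65, 67, 65, 64, 62] * 16)
print(multiscale_entropy(melody))
```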


2021 ◽  
Author(s):  
Alef Iury S. Ferreira ◽  
Frederico S. Oliveira ◽  
Nádia F. Felipe da Silva ◽  
Anderson S. Soares

Gender recognition from speech is a problem related to the analysis of human speech, with applications ranging from personalized product recommendation to forensic science. Identifying the efficiency and cost of the different approaches that address this problem is essential. This work focuses on investigating and comparing the efficiency and cost of different deep learning architectures for gender recognition from speech. The results show that the one-dimensional convolutional model achieves the best results. However, the fully connected model produced comparable results at a lower cost, both in memory usage and in training time.
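A minimal sketch, assuming TensorFlow/Keras, contrasting a one-dimensional convolutional model with a fully connected model of the kind compared in the study; the input shape and layer sizes are assumptions, and the printed parameter counts only illustrate the memory-cost comparison.

```python
import tensorflow as tf

# Hypothetical input: a fixed-length sequence of acoustic feature vectors,
# shape (time_steps, n_features); binary output for the two gender classes.
def conv1d_model(time_steps=128, n_features=40):
    return tf.keras.Sequential([
        tf.keras.layers.Conv1D(32, 5, activation="relu", input_shape=(time_steps, n_features)),
        tf.keras.layers.MaxPooling1D(),
        tf.keras.layers.Conv1D(64, 5, activation="relu"),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

def fully_connected_model(time_steps=128, n_features=40):
    return tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(time_steps, n_features)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

for build in (conv1d_model, fully_connected_model):
    model = build()
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    print(build.__name__, "parameters:", model.count_params())
```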


2021 ◽  
Author(s):  
V. Antsiferova ◽  
T. Pesetskaya ◽  
I. Yuldoshev ◽  
Syanyan Lu ◽  
Cin Van ◽  
...  

This article describes one approach to determining the relevant characteristics of audio signals, using the study of interjections as an example. The comparative evaluation is based on the technical characteristics of dictaphone recordings of interjections that are, first, semantically identical and, second, acoustically similar to one another, but pronounced with different intonation.


2021 ◽  
Author(s):  
V. Antsiferova ◽  
T. Pesetskaya ◽  
I. Yuldoshev ◽  
Syanyan Lu ◽  
Cin Van ◽  
...  

This article describes a method for analyzing the relevant characteristics of audio signals of interjections, using the Russian language as an example. Audio signals of the interjection WOW!, recorded on a dictaphone and pronounced with different intonation, are examined. The interjections are stored as WAV files and analyzed using the Praat software.
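A minimal sketch of extracting basic intonation-related characteristics (pitch and intensity) from such WAV recordings, using librosa rather than Praat; the file names and the chosen feature set are illustrative assumptions, not the article's procedure.

```python
import numpy as np
import librosa

def interjection_features(path):
    """Rough acoustic characteristics of one recorded interjection (WAV file)."""
    y, sr = librosa.load(path, sr=None)
    f0, voiced_flag, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                      fmax=librosa.note_to_hz("C7"), sr=sr)
    rms = librosa.feature.rms(y=y)[0]
    return {
        "duration_s": len(y) / sr,
        "mean_f0_hz": float(np.nanmean(f0)),                   # average pitch of voiced frames
        "f0_range_hz": float(np.nanmax(f0) - np.nanmin(f0)),   # crude intonation span
        "mean_rms": float(rms.mean()),                         # overall intensity
    }

# Hypothetical file names: the same interjection spoken with different intonation.
for path in ["wow_neutral.wav", "wow_surprised.wav"]:
    print(path, interjection_features(path))
```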

