voice signal
Recently Published Documents


TOTAL DOCUMENTS

279
(FIVE YEARS 86)

H-INDEX

14
(FIVE YEARS 2)

2022 ◽  
Vol 5 (1) ◽  
pp. 013001
Author(s):  
Shenghan Gao ◽  
Changyan Zheng ◽  
Yicong Zhao ◽  
Ziyue Wu ◽  
Jiao Li ◽  
...  

2022 ◽  
Vol 32 (2) ◽  
pp. 705-722
Author(s):  
Imran Ahmed ◽  
Sultan Aljahdali ◽  
Muhammad Shakeel Khan ◽  
Sanaa Kaddoura

Author(s):  
Yanjiao Chen ◽  
Meng Xue ◽  
Jian Zhang ◽  
Qianyun Guan ◽  
Zhiyuan Wang ◽  
...  

Voice-based authentication is prevalent on smart devices to verify the legitimacy of users, but it is vulnerable to replay attacks. In this paper, we propose to leverage the distinctive chest motions during speaking to establish a secure multi-factor authentication system, named ChestLive. Compared with other biometric-based authentication systems, ChestLive does not require users to remember any complicated information (e.g., hand gestures, doodles), and the working distance is much longer (30 cm). We use acoustic sensing to monitor chest motions with the built-in speaker and microphone on smartphones. To obtain fine-grained chest motion signals during speaking for reliable user authentication, we derive the Channel Energy (CE) of acoustic signals to capture the chest movement, and then remove the static and non-static interference from the aggregated CE signals. Representative features are extracted from the correlation between the voice signal and the corresponding chest motion signal. Unlike learning-based image or speech recognition models with millions of available training samples, our system has to work with a limited number of samples from legitimate users during enrollment. To address this problem, we resort to meta-learning, which initializes a general model with good generalization properties that can be quickly fine-tuned to identify a new user. We implement ChestLive as an application and evaluate its performance in the wild with 61 volunteers using their smartphones. Experimental results show that ChestLive achieves an authentication accuracy of 98.31% and a false accept rate below 2% against replay attacks and impersonation attacks. We also validate that ChestLive is robust to various factors, including training set size, distance, angle, posture, phone model, and environmental noise.


Author(s):  
Carmen Lucia Pezzette Loro ◽  
Eriberto Oliveira do Nascimento ◽  
Paulo Henrique Trombetta Zannin

As diverse as the knowledge transmission techniques aided by multimedia resources are, nothing replaces the teacher-student relationship, which is developed in the classroom. Therefore, classrooms must offer the necessary conditions for the satisfactory development of teaching and learning activities. In this context, the importance of classroom acoustics is highlighted. The Speech Transmission Index (STI) is one of the broadest and most important parameters for measuring speech intelligibility. The STI weighs the effects that can degrade the voice signal, such as reverberation time and background noise. This work presents an evaluation of the acoustic performance of classrooms using the STI. For that, three constructive patterns were evaluated. The constructive models were named 010, 022 and 023. Students and teachers answered a questionnaire about the perception of noise in the classroom and at school, and the constructive pattern 023 was studied. Computer simulations were performed with ODEON software to predict the STI. The results of the measurements and the questionnaires revealed that the noise that disturbs the activities in the classroom comes from the school itself, not only from the other classrooms, corridors and adjacent patios, but also from inside the classroom itself.
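To make the dependence of the STI on reverberation time and background noise concrete, the following is a minimal single-band sketch in Python. It uses the classical modulation transfer function for an exponentially decaying sound field with stationary noise. It is not the ODEON procedure used in the study and not the full standardized method, which combines seven octave bands with band weights and further corrections. The function names and the equal-weight average over modulation frequencies are assumptions made for illustration.

```python
import math

# Fourteen modulation frequencies (Hz) in one-third-octave steps.
MODULATION_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                       3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_transfer(f_mod_hz, rt60_s, snr_db):
    """Modulation reduction caused by reverberation and by steady noise."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod_hz * rt60_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def simplified_sti(rt60_s, snr_db):
    """Single-band, equally weighted STI estimate (illustrative assumption)."""
    indices = []
    for f_mod in MODULATION_FREQS_HZ:
        m = modulation_transfer(f_mod, rt60_s, snr_db)
        m = min(max(m, 1e-12), 1.0 - 1e-12)                     # keep the log finite
        apparent_snr = 10.0 * math.log10(m / (1.0 - m))         # apparent SNR in dB
        apparent_snr = min(max(apparent_snr, -15.0), 15.0)      # clip to +/-15 dB
        indices.append((apparent_snr + 15.0) / 30.0)            # map to 0..1
    return sum(indices) / len(indices)
```

For example, `simplified_sti(0.5, 20.0)` is higher than `simplified_sti(2.0, 5.0)`, reflecting that a shorter reverberation time and a better signal-to-noise ratio both raise the index.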


2021 ◽  
Vol 1 (1) ◽  
pp. 83-93
Author(s):  
Noor N. Edan ◽  
Nasser N. Khamiss

In mobile communication systems, reducing the bit rate while maintaining acceptable voice quality is necessary for efficient use of channel bandwidth and for user satisfaction. As Long-Term Evolution (LTE) converges towards all-IP solutions and supports VoIP service, voice signals are converted into a coded digital bit-stream and sent over the network. This paper proposes an implementation of the codebook excited linear prediction (CELP) voice codec algorithm at two source rates, a low rate of 9.6 kbps and a medium rate of 16 kbps, to achieve a perceptible level of voice quality while using the available bandwidth efficiently during transmission over advanced LTE. The proposed CELP codec decomposes the voice signal at the encoder into a set of parameters that characterize each frame; these parameters are quantized and encoded for transmission to the decoder. The investigation showed that the link configuration and the applied CELP codec mode are the main influences on the obtained voice capacity and quality. The results also show that voice quality can be traded for capacity, since the low-rate codec produces lower voice quality than the higher-rate codec. Finally, when the system is configured with a higher channel quality indicator (CQI) index, the capacity gain increases to a saturated value of about 500 and 1000 users per cell over a 5 MHz bandwidth for transmit diversity (TD) and Open-Loop Spatial Multiplexing (OLSM), respectively, and up to 1000 and 2000 users per cell over a 10 MHz channel bandwidth for TD and OLSM, respectively.


Author(s):  
JinBeom Shin ◽  
KilSeok Cho ◽  
DongGwan Lee ◽  
TaeHyon Kim

In this paper, we propose a protocol to convert 2W telephone analog signals to Ethernet data in a private PSTN 2W tactical voice system. Several operational problems arise in a tactical telephone network where 2W telephone copper lines are installed hundreds of meters away from the PBX at a headquarters site: it is difficult to install and maintain the 2W copper cable in severe operational fields, and difficult to meet the safety and stability requirements of the telephone line under lightning and electromagnetic environments. To meet these demands, we propose an efficient method in which the 2W analog interface signals between a private PBX system and a 2W telephone are converted to Ethernet messages carried over the optical Ethernet data communication network already deployed in the tactical weapon system. Thus, it is not necessary to install an additional optical cable for the Ethernet telephone line or to maintain the private PSTN 2W telephone network. The method also provides safe and secure telecommunication under lightning and electromagnetic environments. This paper presents the protocol for converting 2W telephone signals to Ethernet between PBX systems and 2W telephones, the protocol for exchanging Ethernet messages between two converters, and the rules for processing the analog signal interface. Finally, we demonstrate that the proposed technique is a feasible solution for the tactical weapon system by analyzing its performance and experimental results, including the bandwidth of the 2W telephone Ethernet network, the transmission latency of the voice signal, and the stability of the optical Ethernet voice network alongside the Ethernet data network.


2021 ◽  
Vol 2132 (1) ◽  
pp. 012029
Author(s):  
Wenxin Wang ◽  
Ranran Zhu ◽  
Hongliang Zhao

The speech transmission index (STI) is an important index for evaluating the quality of speech transmission in a room; it reflects the degree to which the voice signal is affected by room reverberation and noise during transmission. This paper presents an algorithm for directly measuring the STI. White noise is filtered by the Paul Kellet filter to generate pink noise, and the signal envelope is extracted by wavelet transform, which improves the accuracy of envelope extraction and makes the STI measurement more accurate.
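The pink-noise step can be sketched as follows. This is an illustration only: the abstract does not say which variant of the Paul Kellet filter the authors used, so the sketch assumes the widely circulated three-pole "economy" approximation, and the coefficients should be checked against an authoritative copy before use. The function names and the uniform white-noise source are also assumptions.

```python
import random


def pink_noise(n_samples, seed=0):
    """Filter uniform white noise through a three-pole pink-noise approximation.

    Coefficients follow the commonly circulated "economy" Paul Kellet filter
    (assumed here, not confirmed by the source abstract).
    """
    rng = random.Random(seed)
    b0 = b1 = b2 = 0.0
    out = []
    for _ in range(n_samples):
        white = rng.uniform(-1.0, 1.0)
        # Three one-pole low-pass sections with staggered time constants.
        b0 = 0.99765 * b0 + white * 0.0990460
        b1 = 0.96300 * b1 + white * 0.2965164
        b2 = 0.57000 * b2 + white * 1.0526913
        out.append(b0 + b1 + b2 + white * 0.1848)
    return out


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1; near 0 for white noise."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    num = sum(a * b for a, b in zip(centered[:-1], centered[1:]))
    den = sum(v * v for v in centered)
    return num / den
```

Because the filter emphasizes low frequencies, `lag1_autocorrelation(pink_noise(20000))` is clearly positive, whereas the same statistic for the unfiltered white noise would be close to zero.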


Electronics ◽  
2021 ◽  
Vol 10 (23) ◽  
pp. 2902
Author(s):  
Changchun Cai ◽  
Enjian Bai ◽  
Xue-Qin Jiang ◽  
Yun Wu

With the explosive growth of voice information interaction, there is an urgent need for safe and effective compression and transmission methods. In this paper, compressive sensing is used to compress and encrypt speech signals. First, a scheme that combines a linear feedback shift register with an inner product to generate the measurement matrix is proposed. Second, we adopt a new parallel compressive sensing technique to greatly improve the processing efficiency. Further, the two communicating parties use a public-key cryptosystem to share the key safely and select a different measurement matrix for each frame of the voice signal to ensure security. This scheme greatly reduces the difficulty of generating the measurement matrix in hardware and improves the processing efficiency. Compared with the existing scheme by Moreno-Alvarado et al., our scheme reduces the execution time by approximately 8% and the mean square error (MSE) by approximately 5%.
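As a rough illustration of how a linear feedback shift register can drive a measurement matrix, the sketch below fills a ±1 matrix from a 16-bit Fibonacci LFSR and compresses one frame with row-wise inner products. The abstract does not specify the register length, the taps, or how the inner product enters the construction, so the tap set (16, 14, 13, 11), the ±1 mapping, and all function names are assumptions rather than the authors' scheme.

```python
def lfsr_bits(seed, count):
    """Return `count` output bits of a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    if not 0 < seed < (1 << 16):
        raise ValueError("seed must be a non-zero 16-bit value")
    state = seed
    bits = []
    for _ in range(count):
        bits.append(state & 1)                                   # output the low bit
        feedback = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (feedback << 15)                  # shift in the feedback
    return bits


def measurement_matrix(seed, m, n):
    """Build an m x n matrix with entries +1/-1 from the LFSR bit stream."""
    bits = lfsr_bits(seed, m * n)
    return [[1 if bits[r * n + c] else -1 for c in range(n)] for r in range(m)]


def measure(phi, frame):
    """Compress one frame: each measurement is the inner product of a row with the frame."""
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]
```

In this sketch the seed plays the role of the shared key: using a different seed for each frame, for instance `measure(measurement_matrix(frame_seed, 4, 8), frame)`, gives each frame its own matrix, and a receiver that knows the seed can regenerate the same matrix for reconstruction.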


Fluids ◽  
2021 ◽  
Vol 6 (11) ◽  
pp. 412
Author(s):  
Michael Krane

In this paper, the timing of vortex formation on the glottal jet is studied using previously published velocity measurements of flow through a scaled-up model of the human vocal folds. The relative timing of the pulsatile glottal jet and the instability vortices is acoustically important since it determines the harmonic and broadband content of the voice signal. Glottis exit jet velocity time series were extracted from time-resolved planar DPIV measurements. These measurements were acquired at four glottal flow speeds (uSS = 16.1–38 cm/s) and four glottis open times (To = 5.67–23.7 s), providing a Reynolds number range Re = 4100–9700 and reduced vibration frequency f* = 0.01–0.06. Exit velocity waveforms showed temporal behavior on two time scales, one that correlates to the period of vibration and another characterized by short, sharp velocity peaks (which correlate to the passage of instability vortices through the glottis exit plane). The vortex formation time, estimated by computing the time difference between subsequent peaks, was shown not to be well correlated from one vibration cycle to the next. The principal finding is that vortex formation time depends not only on cycle phase but also varies strongly with the reduced frequency of vibration. In all cases, a strong high-frequency burst of vortex motion occurs near the end of the cycle, consistent with perceptual studies using synthesized speech.


2021 ◽  
Vol 67 (6) ◽  
pp. 46-51
Author(s):  
P.M. Kovalchuk ◽  
T.A. Shydlovska ◽  

We aimed to analyse voice signals in 40 patients with chronic laryngitis caused by exposure to chemical factors. We examined 20 people with catarrhal chronic laryngitis (group 1), 20 people with subatrophic chronic laryngitis (group 2) and 15 healthy volunteers as controls. All subjects underwent acoustic examination of the voice signal using the software Praat V 4.2.1. We studied the following acoustic measures: Jitter, Shimmer and NHR (noise-to-harmonics ratio). The analysis of the obtained data revealed statistically significant differences in the average values of the Jitter and Shimmer measures, as well as in the ratio of the nonharmonic (noise) and harmonic components in the spectrum (NHR), in patients with chronic laryngitis (groups 1 and 2) compared with controls. In group 1 (chronic catarrhal laryngitis), the average values of the acoustic measures were: Jitter - 0.92 ± 0.1%, Shimmer - 5.31 ± 0.5%, NHR - 0.078 ± 0.04. In group 2 (subatrophic laryngitis), the average values were: Jitter - 0.67 ± 0.6%, Shimmer - 6.57 ± 0.7% and NHR - 0.028 ± 0.003. The obtained data indicate a pronounced instability of the voice in frequency and amplitude and a significant proportion of the noise component in the spectrum of the voice signal in the examined patients with chronic laryngitis exposed to chemical factors. The most pronounced alterations were found in patients with catarrhal chronic laryngitis. We conclude that the quantitative values of the spectral analysis measures of the voice signal (Jitter, Shimmer, NHR) may serve as valuable criteria of the degree of voice impairment. This may be helpful in determining the effectiveness of rehabilitation measures.
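For readers unfamiliar with these measures, the following Python sketch shows how local jitter and local shimmer are commonly computed from per-cycle measurements. The abstract does not state which jitter and shimmer variants were reported, so the choice of the "local" definitions (mean absolute difference between consecutive cycles, relative to the mean) is an assumption, as are the function names. The inputs are assumed to be already-extracted glottal period durations and peak amplitudes; pitch-period extraction itself is not shown.

```python
def _local_perturbation_percent(values):
    """Mean absolute difference of consecutive values, relative to their mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values[:-1], values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter_percent(periods_s):
    """Cycle-to-cycle variation of the glottal period (frequency instability)."""
    return _local_perturbation_percent(periods_s)


def local_shimmer_percent(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (amplitude instability)."""
    return _local_perturbation_percent(amplitudes)
```

For a perfectly periodic voice both values are zero; irregular cycle lengths raise jitter and irregular cycle amplitudes raise shimmer, which is the sense in which the abstract speaks of instability in frequency and amplitude.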

