Source Separation of Convolutive and Noisy Mixtures Using Audio-Visual Dictionary Learning and Probabilistic Time-Frequency Masking

2013 ◽  
Vol 61 (22) ◽  
pp. 5520-5535 ◽  
Author(s):  
Qingju Liu ◽  
Wenwu Wang ◽  
Philip J. B. Jackson ◽  
Mark Barnard ◽  
Josef Kittler ◽  
...  
2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Gang Yu

In structural dynamic analysis, blind source separation (BSS) is accepted as one of the most effective techniques for modal identification, yet extracting modal parameters from a very limited number of sensors remains highly challenging. In this paper, we first review the drawbacks of conventional BSS methods and then propose a novel underdetermined BSS method for modal identification with limited sensors. The proposed method is built on the clustering features of the time-frequency (TF) transform of modal response signals. We find that the TF energy belonging to different monotone modes clusters into distinct straight lines, and we provide a detailed theorem that explains this clustering. Moreover, the TF coefficients of each mode are used to reconstruct all monotone signals, which allows the modal parameters of each mode to be identified individually. Two experimental validations demonstrate the effectiveness of the proposed method.
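The clustering idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a real-valued 2-by-3 mixing matrix, three synthetic damped cosines as "modes", a plain Hann-windowed STFT, and a simple energy-weighted histogram in place of a proper clustering step. All names and numeric values (`stft`, `pick_peaks`, the frequencies, the direction angles) are hypothetical. Under these assumptions, TF points dominated by a single mode fall on a straight line through the origin in the two-sensor plane, and the line's angle reveals the corresponding column of the mixing matrix.

```python
import numpy as np

def stft(x, win_len=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (frames, bins)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def pick_peaks(hist, centers, n_peaks, min_sep_deg=10.0):
    """Greedily pick the heaviest histogram bins that are well separated."""
    chosen = []
    for idx in np.argsort(hist)[::-1]:
        c = centers[idx]
        # Line angles are periodic with period 180 degrees.
        if all(min(abs(c - p), 180.0 - abs(c - p)) >= min_sep_deg for p in chosen):
            chosen.append(c)
        if len(chosen) == n_peaks:
            break
    return np.sort(np.array(chosen))

# Hypothetical test signal: three damped single-frequency modes.
fs = 1000.0
t = np.arange(0, 4, 1 / fs)
freqs_hz = [20.0, 55.0, 90.0]
decay = [0.3, 0.5, 0.7]
modes = np.array([np.exp(-d * t) * np.cos(2 * np.pi * f * t)
                  for f, d in zip(freqs_hz, decay)])

# Assumed real mixing matrix: each column is a direction in the sensor plane.
true_deg = np.array([20.0, 60.0, 130.0])
A = np.vstack([np.cos(np.deg2rad(true_deg)), np.sin(np.deg2rad(true_deg))])
x = A @ modes  # 2 sensors, 3 modes: an underdetermined mixture

X1, X2 = stft(x[0]), stft(x[1])
energy = np.abs(X1) ** 2 + np.abs(X2) ** 2
mask = energy > 0.01 * energy.max()  # keep only high-energy TF points

# Angle of the line through the origin on which each TF point lies (mod pi).
angle = 0.5 * np.arctan2(2 * np.real(X1 * np.conj(X2)),
                         np.abs(X1) ** 2 - np.abs(X2) ** 2)
angle = np.mod(angle, np.pi)

hist, edges = np.histogram(np.rad2deg(angle[mask]), bins=180,
                           range=(0.0, 180.0), weights=energy[mask])
centers = 0.5 * (edges[:-1] + edges[1:])
estimated_deg = pick_peaks(hist, centers, n_peaks=3)
print("estimated line angles (deg):", estimated_deg)
```

Once the line directions are known, the TF points near each line could be masked and inverted to recover one mode at a time. That reconstruction step is omitted here.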


2007 ◽  
Vol 187 (1) ◽  
pp. 153-162 ◽  
Author(s):  
Keiko Fujita ◽  
Yoshitsugu Takei ◽  
Akira Morimoto ◽  
Ryuichi Ashino

2021 ◽  
Author(s):  
Jiawen Chua

<p>In most real-time systems, particularly in applications involving system identification, latency is a critical issue. These applications include, but are not limited to, blind source separation (BSS), beamforming, speech dereverberation, acoustic echo cancellation and channel equalization. The system latency consists of an algorithmic delay and the computation time of the estimation. The latter can be hidden by using a multi-threaded system that runs the estimation process and the processing procedure simultaneously. The former, which amounts to a delay of one window length, is usually unavoidable for frequency-domain approaches. In these approaches, a block of data is acquired by using a window, transformed and processed in the frequency domain, and converted back to the time domain by using an overlap-add technique. In the frequency domain, the convolutive model, which is usually used to describe a linear time-invariant (LTI) system, can be represented by a series of multiplicative models to facilitate estimation. To implement frequency-domain approaches in real-time applications, the short-time Fourier transform (STFT) is commonly used. The window used in the STFT must be at least twice the length of the room impulse response, which is typically long, so that the multiplicative model is sufficiently accurate. The delay constraint imposed by the associated blockwise processing window length makes most frequency-domain approaches inapplicable to real-time systems. This thesis aims to design a BSS system that can be used in a real-time scenario with minimal latency. Existing BSS approaches can be integrated into our system to perform source separation with low delay without affecting the separation performance. The second goal is to design a BSS system that can perform source separation in a non-stationary environment.
We first introduce a subspace approach to directly estimate the separation parameters in the low-frequency-resolution time-frequency (LFRTF) domain. In the LFRTF domain, a shorter window is used to reduce the algorithmic delay of the system during signal acquisition, e.g., the window length is shorter than the room impulse response. The subspace method facilitates the deconvolution of a convolutive mixture into a new instantaneous mixture and simplifies the estimation process. Second, we propose an alternative approach to address the algorithmic latency problem. The alternative method enables us to obtain the separation parameters in the LFRTF domain from parameters estimated in the high-frequency-resolution time-frequency (HFRTF) domain, where the window length is longer than the room impulse response, without affecting the separation performance. The thesis also provides a solution to the BSS problem in a non-stationary environment. We utilize the "meta-information" obtained from previous BSS operations to facilitate future separation without performing the entire BSS process again. Repeating a BSS process can be computationally expensive, and most conventional BSS algorithms require a sufficient number of signal samples for analysis, which prolongs the estimation delay. By utilizing information from the entire spectrum, our method enables us to update the separation parameters with only a single snapshot of observation data. Hence, our method minimizes the estimation period, reduces redundancy and improves the efficacy of the system. The final contribution of the thesis is a non-iterative method for impulse response shortening. This method allows us to use a shorter representation to approximate the long impulse response. It further improves the computational efficiency of the algorithm while still achieving satisfactory performance.</p>
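The trade-off described in this abstract, between STFT window length (algorithmic delay) and the accuracy of the multiplicative model, can be checked numerically. The sketch below is not taken from the thesis. It assumes a white-noise source, a synthetic exponentially decaying impulse response of 512 samples standing in for a room impulse response, a Hann window with 75% overlap, and a 16 kHz sampling rate. The helper names `stft` and `multiplicative_error` are hypothetical. It measures how far the STFT of the convolved signal is from the per-bin product of the frequency response and the source STFT.

```python
import numpy as np

def stft(x, win_len, hop):
    """Hann-windowed STFT; returns an array of shape (frames, bins)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def multiplicative_error(source, observed, rir, win_len):
    """Relative error of the per-bin model Y(m, k) ~= H(k) * S(m, k)."""
    hop = win_len // 4
    S = stft(source, win_len, hop)
    Y = stft(observed, win_len, hop)
    # Sample the true frequency response at the STFT bin frequencies.
    # Zero-padding to a multiple of win_len avoids truncating a long response.
    k = int(np.ceil(len(rir) / win_len))
    H = np.fft.rfft(rir, n=k * win_len)[::k]
    residual = Y - H[np.newaxis, :] * S
    return np.sum(np.abs(residual) ** 2) / np.sum(np.abs(Y) ** 2)

rng = np.random.default_rng(0)
fs = 16000                                # assumed sampling rate in Hz
source = rng.standard_normal(2 * fs)      # 2 s of white noise
rir_len = 512
# Synthetic decaying response (an assumption, not a measured room response).
rir = rng.standard_normal(rir_len) * np.exp(-np.arange(rir_len) / 100.0)
observed = np.convolve(source, rir)[:len(source)]

win_lengths = [128, 512, 2048, 8192]
errors = [multiplicative_error(source, observed, rir, n) for n in win_lengths]
for n, e in zip(win_lengths, errors):
    print(f"window {n:5d} samples, delay {1000 * n / fs:6.1f} ms, "
          f"relative error {e:.4f}")
```

In this setup the model error shrinks as the window grows well beyond the impulse response, while the one-window algorithmic delay grows in proportion. This is the tension that motivates working in the LFRTF domain with a short window.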

