Speaker diarization plays a pivotal role in speech technology. In general, speaker diarization is the process of partitioning an input audio stream into homogeneous segments according to speaker identity. Diarization can improve the readability of automatic transcriptions, since it organizes the audio stream into speaker turns and often recovers the true speaker identity. In this research work, a novel speaker diarization approach is introduced with three major phases: feature extraction, Speech Activity Detection (SAD), and speaker segmentation and clustering. Initially, Mel Frequency Cepstral Coefficient (MFCC) based features are extracted from the collected input audio stream (Telugu language). Subsequently, in the SAD phase, music and silence signals are removed. The acquired speech signals are then segmented for each individual speaker. Finally, the segmented signals are subjected to a speaker clustering process based on an optimized Convolutional Neural Network (CNN). To make the clustering more appropriate, the weights and activation function of the CNN are fine-tuned by a new Self Adaptive Sea Lion Algorithm (SA-SLnO). A comparative analysis is then carried out to exhibit the superiority of the proposed speaker diarization work. Accordingly, the accuracy of the proposed method is 0.8073, which is 5.255, 2.45%, and 0.075 superior to the existing works, respectively.
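As an illustration of the first two phases described above (MFCC feature extraction and energy-based speech activity detection), the following is a minimal NumPy sketch. It is not the paper's implementation: the function names, frame sizes, filterbank count, and the simple energy threshold used for SAD are all assumptions chosen for clarity.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (illustrative helper)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop: i * hop + frame_len] for i in range(n)])

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filterbank over the positive FFT bins."""
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(x, sr, n_fft=512, hop=160, n_filters=26, n_ceps=13):
    """Frame -> window -> power spectrum -> mel energies -> log -> DCT-II."""
    frames = frame_signal(x, n_fft, hop) * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_e = np.log(energies + 1e-10)
    # DCT-II basis decorrelates the log filterbank energies into cepstra.
    k = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    (2 * k + 1) / (2 * n_filters)))
    return log_e @ basis.T  # shape: (num_frames, n_ceps)

def energy_sad(x, frame_len=400, hop=160, thresh_db=-35.0):
    """Flag frames whose energy is within thresh_db of the loudest frame.

    A real SAD would also reject music; this toy version only drops silence.
    """
    frames = frame_signal(x, frame_len, hop)
    db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return db > (db.max() + thresh_db)
```

In a diarization pipeline of the kind outlined above, the SAD mask would first discard non-speech frames, and the MFCC matrix of the surviving frames would then feed the segmentation and clustering stages.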