A Review on Speaker Diarization for Whispered Speech Audio
DOI:
https://doi.org/10.47392/IRJAEM.2025.0279
Keywords:
Speaker diarization, Feature extraction, Voice activity detection, Deep neural network, Speaker clustering, Diarization Error Rate
Abstract
Speaker diarization, the process of partitioning an audio stream into segments according to speaker identity, is crucial for many applications in speech processing and analysis. Whispered speech, characterized by its low amplitude and altered spectral properties, presents unique challenges for conventional diarization algorithms designed for clear, normally phonated speech. In this study, I propose a novel approach to supervised speaker diarization tailored specifically to whispered speech audio streams. The approach uses supervised learning, training models on annotated data so that they can accurately distinguish between speakers in whispered speech recordings. The design incorporates feature extraction techniques that capture the faint spectral cues present in whispered speech, thereby improving the diarization system's discriminative ability. Furthermore, I investigate combining acoustic modeling with domain-specific knowledge to enhance diarization performance in whispered speech scenarios. I evaluate the proposed approach on a variety of whispered speech datasets, comparing its effectiveness with state-of-the-art diarization techniques. The evaluation examines how precisely whispered speech can be divided into speaker-specific intervals using a supervised technique. I also analyze the effects of several factors on diarization performance, including feature representations and dataset properties. The findings of this research contribute to advancing speaker diarization technology, particularly in challenging acoustic conditions characterized by whispered speech. The proposed supervised approach holds promise for practical applications in surveillance, forensic analysis, and human-computer interaction, where accurate speaker segmentation in whispered speech recordings is essential.
License
Copyright (c) 2025 International Research Journal on Advanced Engineering and Management (IRJAEM)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.