Abstract
Voice/non-voice detection is the task of detecting the presence or absence of vocal fold activity in regions of a speech signal. Most existing state-of-the-art methods depend exclusively on the amplitude of the signal, in either the time or the frequency domain, and their performance degrades significantly on weakly voiced segments, laryngeal transitions, and noisy speech. In this paper, we propose a robust method for detecting voice/non-voice regions in the speech signal based on the harmonics of the phase of the source signal. Here, the source signal is derived by removing the effect of vocal tract resonances from the speech signal using zero frequency filtering (ZFF). Experimental results demonstrate the robustness of the proposed method in accurately detecting voiced/non-voiced regions under adverse conditions. The performance of the proposed method is compared with a state-of-the-art method based on the sum of residual harmonics and with three well-known standard voice activity detection (VAD) algorithms: G729B, adaptive multi-rate VAD option-1 (AMR1), and adaptive multi-rate VAD option-2 (AMR2).
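As an illustration of the ZFF step mentioned above, the following is a minimal sketch of zero frequency filtering as it is commonly described in the literature: difference the signal, pass it through a cascade of two ideal zero-frequency resonators (equivalent to four cumulative sums), and remove the resulting trend by repeatedly subtracting a local mean. It is not taken from this paper. The function name `zero_frequency_filter`, the 10 ms window, and the three trend-removal passes are assumptions chosen for illustration, and the phase-harmonics analysis proposed here is not shown.

```python
from itertools import accumulate


def _remove_local_mean(y, half):
    """Subtract the mean over a window of up to 2*half+1 samples (truncated at the edges)."""
    n = len(y)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append(y[i] - sum(y[lo:hi]) / (hi - lo))
    return out


def zero_frequency_filter(speech, fs, window_ms=10.0, trend_passes=3):
    """Illustrative ZFF sketch; window_ms and trend_passes are assumed values."""
    s = [float(v) for v in speech]
    if not s:
        return []
    # Step 1: first-order difference to remove any DC offset.
    x = [0.0] + [s[i] - s[i - 1] for i in range(1, len(s))]
    # Step 2: two cascaded resonators y[n] = 2*y[n-1] - y[n-2] + x[n].
    # Each one is a double integrator, so the cascade is four cumulative sums.
    y = x
    for _ in range(4):
        y = list(accumulate(y))
    # Step 3: remove the polynomial trend left by the integrators.
    half = max(1, int(round(window_ms * 1e-3 * fs / 2)))
    for _ in range(trend_passes):
        y = _remove_local_mean(y, half)
    return y


# Usage: a synthetic 100 Hz impulse train sampled at 8 kHz.
excitation = [1.0 if i % 80 == 0 else 0.0 for i in range(800)]
zff_signal = zero_frequency_filter(excitation, fs=8000)
```

In practice the window length is usually tied to the average pitch period of the speaker, so a fixed 10 ms value should be treated as a placeholder.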