Voice Activity Detection Based on SNR and Non-Intrusive Speech Intelligibility Estimation

Soo Jeong An; Seung Ho Choi

Source information

국제인공지능학회 (formerly 한국인터넷방송통신학회), International Journal of Internet, Broadcasting and Communication, Vol.11 No.4, 2019.11, pp.26-30 (KCI candidate journal)

Citations: 0 (data provided by Naver Academic)

Abstract

English

This paper proposes a new voice activity detection (VAD) method based on SNR and non-intrusive speech intelligibility estimation. Conventional SNR-based VAD methods obtain the voice activity probability by estimating the frame-wise SNR at each spectral component. However, these methods do not perform well across various noisy environments. We devise a hybrid VAD method that uses non-intrusive speech intelligibility estimation in addition to SNR estimation, where the speech intelligibility score is estimated by a deep neural network. To train the parameters of the deep neural network, we use the MFCC vector as the input and the intrusive speech intelligibility score, STOI (Short-Time Objective Intelligibility), as the target output. We develop a speech presence measure that classifies each noisy frame as voice or non-voice by calculating the weighted average of the estimated STOI value and the conventional SNR-based VAD value at each frame. Experimental results show that the proposed method performs better than the conventional VAD method in various noisy environments, especially when the SNR is very low.
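
The frame-level fusion described in the abstract can be illustrated with a minimal sketch. It assumes both per-frame inputs are already available and scaled to [0, 1]: the STOI value estimated by the network and the SNR-based voice activity probability. The function names, the weight of 0.5, and the decision threshold of 0.5 are hypothetical placeholders for illustration; the abstract does not state the values the authors used.

```python
from typing import List


def speech_presence(stoi_est: List[float], snr_vad: List[float],
                    weight: float = 0.5) -> List[float]:
    """Weighted average of estimated STOI and SNR-based VAD value per frame.

    `weight` is the share given to the STOI estimate (assumed value, not
    taken from the paper); the SNR-based value receives 1 - weight.
    """
    if len(stoi_est) != len(snr_vad):
        raise ValueError("both inputs must have one value per frame")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * s + (1.0 - weight) * p
            for s, p in zip(stoi_est, snr_vad)]


def classify_frames(scores: List[float], threshold: float = 0.5) -> List[bool]:
    """Label a frame as voice (True) when its score reaches the threshold.

    The threshold is an assumed placeholder, not a value from the paper.
    """
    return [score >= threshold for score in scores]


# Three made-up frames: noise, clear speech, and an ambiguous frame.
stoi_est = [0.2, 0.8, 0.6]
snr_vad = [0.1, 0.9, 0.3]
scores = speech_presence(stoi_est, snr_vad)
decisions = classify_frames(scores)
```

In practice the weight and threshold would be tuned on held-out noisy data; a larger weight leans on the intelligibility estimate, which the abstract suggests helps most at very low SNR.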

Author information

Soo Jeong An Dept. of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Seoul, Korea
Seung Ho Choi Dept. of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Seoul, Korea
