Article Information
Abstract
English
In this paper, an interaural time difference (ITD) estimation method is proposed for binaural speech separation in reverberant environments. First, the auditory signals are represented in the time-frequency (T-F) domain, and the ITD for each T-F bin is then estimated using generalized cross-correlation (GCC) with a maximum likelihood (ML) weighting function. In particular, the ML weighting function is designed to reduce the reverberation effect. Then, a mask is estimated by comparing the estimated ITD with the ITD corresponding to the location of the pre-defined target speech source. Finally, the target speech is separated by applying the mask to the auditory signals. It is shown that the proposed ITD estimation method outperforms a conventional cross-correlation-based ITD estimation method under reverberant conditions in terms of the signal-to-noise ratio (SNR) and signal-to-distortion ratio (SDR) of the separated speech signals.
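The pipeline summarized in the abstract (per-unit ITD estimation by GCC, then a mask built by comparing each estimate with the target ITD) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration. The function names `gcc_itd` and `itd_mask`, the 1 ms lag search range, and the tolerance `tol` are hypothetical. The abstract does not specify the ML weighting function, so the familiar PHAT weighting is used here as a stand-in. The sketch operates on a single frame of one frequency channel, whereas the paper applies the estimate to every T-F bin of a gammatone analysis.

```python
import numpy as np


def gcc_itd(left, right, fs, max_itd=1e-3, phat=True):
    """Estimate the ITD (in seconds) of one frame by generalized cross-correlation.

    A positive result means the right-ear signal lags the left-ear signal.
    NOTE: PHAT weighting is a stand-in; the paper's ML weighting is not given
    in the abstract and is not reproduced here.
    """
    n = len(left) + len(right)            # zero-pad so the correlation is linear
    spec_l = np.fft.rfft(left, n)
    spec_r = np.fft.rfft(right, n)
    cross = spec_r * np.conj(spec_l)      # cross-power spectrum
    if phat:
        cross = cross / (np.abs(cross) + 1e-12)   # keep phase only
    cc = np.fft.irfft(cross, n)           # cc[k] = sum_n right[n + k] * left[n]
    max_lag = max(1, int(round(max_itd * fs)))
    # Reorder so that index 0 corresponds to lag -max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs


def itd_mask(itd_est, itd_target, tol=1e-4):
    """Binary mask: 1 where the estimated ITD is within tol of the target ITD."""
    itd_est = np.asarray(itd_est, dtype=float)
    return (np.abs(itd_est - itd_target) <= tol).astype(float)
```

As a usage example, delaying a 512-sample white-noise frame by 5 samples at a 16 kHz sampling rate and passing both versions to `gcc_itd` should recover an ITD of 5/16000 s. Multiplying the T-F representation of the auditory signals by the output of `itd_mask` would then retain only the units attributed to the target direction.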
Table of Contents
1. Introduction
2. Binaural Speech Separation
2.1. Gammatone Analysis
2.2. ITD Estimation
2.3. Mask Estimation
2.4. Speech Reconstruction
3. Proposed ML-GCC Based ITD Estimation
4. Performance Evaluation
4.1. Database
4.2. SNR and SDR Measurements
5. Conclusion
Acknowledgements
References