Abstract
Reverberation degrades speech quality and impairs speech intelligibility. This degradation can also hinder the analysis of speech signals and scientific investigations that rely on them. In addition, because reverberation lowers speech recognition performance, dereverberation is widely used as a preprocessing step. In this paper, we compare the performance of various vocoders in a dereverberation technique based on a convolutional neural network (CNN). The U-Net architecture was used for dereverberation, and WaveGlow, MelGAN, and Griffin-Lim were employed as vocoders. A vocoder receives speech features as input and reconstructs a time-domain speech signal from them. In particular, recent neural vocoders take a mel-spectrogram as the input feature and can reconstruct high-quality speech signals. To compare the vocoders, we measured the perceptual evaluation of speech quality (PESQ) and confirmed that the scores for all vocoders were higher than those of the unprocessed reverberant signals.
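As a small illustration of the mel-spectrogram feature mentioned above, the following sketch computes the center frequencies of a mel filterbank. It is not taken from the paper: the HTK-style mel formula is only one common convention, and the 80 bands and 8000 Hz upper limit are assumed values chosen for illustration, since the abstract does not state the settings actually used.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (HTK-style formula, an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_mels, f_min, f_max):
    """Return n_mels center frequencies (Hz), equally spaced on the mel scale."""
    m_lo, m_hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_hi - m_lo) / (n_mels + 1)
    return [mel_to_hz(m_lo + step * (i + 1)) for i in range(n_mels)]


# Assumed settings for illustration only: 80 bands up to 8 kHz.
centers = mel_center_frequencies(n_mels=80, f_min=0.0, f_max=8000.0)
print([round(c, 1) for c in centers[:3]], "...", round(centers[-1], 1))
```

Because the bands are equally spaced in mels, they are narrow at low frequencies and wide at high frequencies, which is why a mel-spectrogram is a compact input for the dereverberation network and the vocoder.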
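Table of Contents
Ⅰ. Introduction
Ⅱ. Speech Dereverberation Based on a Convolutional Neural Network
Ⅲ. Performance Evaluation
Ⅳ. Conclusion
Ⅴ. Acknowledgments
Ⅵ. References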