Poster Session Ⅱ

Improving Speaker Recognition with Parallel WaveGAN

Abstract (English)

In recent years, Generative Adversarial Networks (GANs) have emerged as a prevailing solution for combating data scarcity in various domains. This study investigates the use of WaveGAN, a GAN architecture specialized for audio generation, to address the challenges stemming from the limited availability of audio datasets. Our primary objective is to tackle the issue of constrained audio data resources by exploiting the generative capability of WaveGAN. The research is driven by the overarching goal of investigating the capacity of convolutional neural networks (CNNs) to extract significant insights from an extensive corpus of human speech data. A key focus of our work is to demonstrate the effectiveness of WaveGAN in generating synthetic audio data, thereby expanding the breadth of our audio dataset and bolstering the robustness of our classification models. Our study aims to yield improved classification results, providing crucial insights into the viability of this approach for alleviating data scarcity challenges in audio analysis.
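
The pipeline outlined in the abstract (generating synthetic speech with a GAN to enlarge the training set, extracting MFCC features, and training a CNN classifier) can be illustrated with the rough sketch below. This is a minimal sketch, not the authors' implementation: it assumes MFCC extraction via librosa, a small PyTorch CNN, and placeholder constants (SR, N_MFCC, N_SPEAKERS); the actual Parallel WaveGAN configuration and network architecture are described only in the full paper.

import numpy as np
import librosa
import torch
import torch.nn as nn

SR = 16000          # assumed sampling rate
N_MFCC = 40         # assumed number of MFCC coefficients
N_SPEAKERS = 10     # assumed number of speaker classes


def mfcc_features(waveform: np.ndarray, sr: int = SR) -> np.ndarray:
    """Return an (n_mfcc, frames) MFCC matrix for one utterance."""
    return librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=N_MFCC)


class SpeakerCNN(nn.Module):
    """Small 2-D CNN that classifies speakers from MFCC 'images' (1 x n_mfcc x frames)."""

    def __init__(self, n_speakers: int = N_SPEAKERS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_speakers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))


if __name__ == "__main__":
    # Demo with random audio so the sketch runs standalone; in the paper's setting
    # the training set would mix real utterances with Parallel WaveGAN-generated ones.
    wav = np.random.randn(SR * 2).astype(np.float32)      # 2 seconds of noise
    feats = mfcc_features(wav)                             # shape (N_MFCC, frames)
    x = torch.from_numpy(feats).float()[None, None]        # shape (1, 1, N_MFCC, frames)
    logits = SpeakerCNN()(x)
    print(logits.shape)                                    # torch.Size([1, N_SPEAKERS])

In this sketch the synthetic waveforms would simply be appended to the real training utterances before feature extraction, which is how the augmentation expands the dataset without changing the classifier itself.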

Table of Contents

Abstract
I. INTRODUCTION
II. METHOD
A. Extraction of MFCC
B. Parallel WaveGAN
III. EXPERIMENTAL RESULTS
A. Dataset
B. CNN Architecture
C. Results
IV. CONCLUSION
ACKNOWLEDGMENT
REFERENCES

Author Information

  • Kim Dong Jun, Sejong University
  • Habib Khan, Sejong University
  • Hikmat Yar, Sejong University
