Vowels Phoneme Classifier through Vocal Tract Image Analysis

Rodrigo Picinini Méxas; Unsang Park

Oral Session Ⅳ AI: Speech and Text Analysis

Article information

Vowels Phoneme Classifier through Vocal Tract Image Analysis

Korean Institute of Next Generation Computing, Proceedings of the 2023 Korean Institute of Next Generation Computing Spring Conference, June 2023, pp. 163-165

Abstract

Identifying phonemes from the shape of the vocal tract can support tasks such as diagnosing pronunciation difficulties in patients by analyzing images such as MRI scans. However, the lack of reliable vocal tract datasets makes such tasks hard to accomplish. This paper presents an initial proposal for building a vocal tract dataset and for applying it to phoneme classification. The dataset was created with the Vocal Tract Lab Python API, and the generated images were used as input for training the classifier. The vocal tract images were generated for different ages and genders. Only phonemes representing vowels are analyzed, and the number of images created for training is small, which made the phoneme classification test results fluctuate between training runs. Still, the current work represents an initial step toward further work in this direction.
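
The fluctuation between training runs that the abstract attributes to the small training set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the vowel labels, the synthetic feature vectors standing in for generated vocal tract images, and the nearest-centroid classifier, which is swapped in for simplicity and is not the classifier used in the paper. No Vocal Tract Lab calls are made.

```python
import random
import statistics

# Hypothetical label set; the abstract does not list the vowels it uses.
VOWELS = ["a", "e", "i", "o", "u"]


def make_synthetic_dataset(samples_per_vowel, dim=16, noise=2.0, seed=0):
    """Stand-in for generated vocal tract images: noisy vectors around one prototype per vowel."""
    rng = random.Random(seed)
    prototypes = {v: [rng.gauss(0, 1) for _ in range(dim)] for v in VOWELS}
    data = []
    for vowel in VOWELS:
        for _ in range(samples_per_vowel):
            features = [p + rng.gauss(0, noise) for p in prototypes[vowel]]
            data.append((features, vowel))
    return data


def train_centroids(train):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    return min(
        centroids,
        key=lambda label: sum((c - f) ** 2 for c, f in zip(centroids[label], features)),
    )


def run_once(data, test_fraction, seed):
    """One training run: random train/test split, then test accuracy."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    centroids = train_centroids(train)
    correct = sum(predict(centroids, features) == label for features, label in test)
    return correct / len(test)


def accuracy_spread(samples_per_vowel, runs=20):
    """Mean and standard deviation of test accuracy over repeated runs."""
    data = make_synthetic_dataset(samples_per_vowel)
    accuracies = [run_once(data, 0.2, seed) for seed in range(runs)]
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


if __name__ == "__main__":
    for size in (5, 200):
        mean_acc, std_acc = accuracy_spread(size)
        print(f"{size} samples per vowel: mean={mean_acc:.3f}, std={std_acc:.3f}")
```

Comparing the reported standard deviation for the two dataset sizes gives a rough sense of how much a small test split alone can move the accuracy from one run to the next; it says nothing about the accuracy the authors obtained.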

Author information

Rodrigo Picinini Méxas, Department of Computer Science and Engineering, Sogang University
Unsang Park, Department of Computer Science and Engineering, Sogang University
