3D Lip-Reading for Korean Word Recognition Using PointNet
(PointNet을 이용한 3D 한국어 단어 인식 Lip-Reading)

Sung-Hoon Shin; Je Heo; Hae-Uk Lee; Hyuk-Jun Oh

Publication information

국제차세대융합기술학회, 차세대융합기술학회논문지, Vol. 7, No. 12, December 2023, pp. 2022-2028 (KCI-indexed)


Abstract

Existing lip-reading research has mostly focused on recognizing English from video. This paper proposes a lip-reading method for recognizing 10 Korean words, and proposes a 3D lip-reading method that uses a ToF sensor in addition to images. Two kinds of data are used, RGB image data and 3D data in Point Cloud form; a network is trained on each, and the results are reported. A VGGNet-based network is used for the RGB image data and a PointNet-based network for the Point Cloud data. The hyperparameters and architecture of each network were tuned to suit its data. The network trained on 3D data achieved the highest result, which indicates that depth information played an important role in lip reading.
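To make the PointNet idea mentioned in the abstract concrete, here is a minimal sketch of a PointNet-style classifier head: a shared per-point MLP, a max-pool over points, and a small classification head with 10 outputs (one per word). This is not the authors' network. The layer sizes, the 1024-point cloud, and the random untrained weights are all assumptions chosen for illustration, and the input transform networks of the original PointNet design are omitted.

```python
import numpy as np

NUM_CLASSES = 10  # the abstract mentions 10 Korean words


def init_params(rng, dims):
    """Random, untrained dense-layer weights; the sizes are illustrative assumptions."""
    return [
        (rng.standard_normal((i, o)) * np.sqrt(2.0 / i), np.zeros(o))
        for i, o in zip(dims[:-1], dims[1:])
    ]


def mlp(x, params, final_relu=True):
    """Apply dense layers with ReLU; optionally skip ReLU on the last layer."""
    for k, (w, b) in enumerate(params):
        x = x @ w + b
        if final_relu or k < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x


def pointnet_logits(points, point_params, head_params):
    """Map an (N, 3) array of xyz points to a (NUM_CLASSES,) vector of logits."""
    per_point = mlp(points, point_params)   # same MLP applied to every point
    global_feat = per_point.max(axis=0)     # symmetric max-pool over the N points
    return mlp(global_feat, head_params, final_relu=False)


rng = np.random.default_rng(0)
point_params = init_params(rng, [3, 64, 128, 256])          # hypothetical widths
head_params = init_params(rng, [256, 128, NUM_CLASSES])     # hypothetical widths

cloud = rng.standard_normal((1024, 3))            # stand-in for one lip point cloud
shuffled = cloud[rng.permutation(len(cloud))]     # same points, different order

logits = pointnet_logits(cloud, point_params, head_params)
logits_shuffled = pointnet_logits(shuffled, point_params, head_params)
predicted_class = int(np.argmax(logits))
```

Because the max-pool is symmetric, `logits` and `logits_shuffled` agree up to floating-point rounding. This order independence is what lets such a network consume an unordered Point Cloud directly, without first projecting it onto an image grid.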

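Author information

Sung-Hoon Shin (신성훈). Researcher, LIG Nex1
Je Heo (허제). Researcher, LIG Nex1
Hae-Uk Lee (이해욱). Student, Department of Electronics and Communications Engineering, Kwangwoon University
Hyuk-Jun Oh (오혁준). Professor, Department of Electronics and Communications Engineering, Kwangwoon University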
