Deriving Visual Verbs and Constructing the ActionNet Database for Semantic Video Understanding

배창석; 김보경

Article Information

Visual Verb and ActionNet Database for Semantic Visual Understanding


Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol.14 No.5, 2018.10, pp.19-30 (KCI-listed)

Citations: 0 (source: Naver Academic)

Abstract

English

Visual information understanding is known as one of the most difficult and challenging problems in realizing machine intelligence. This paper proposes the derivation of visual verbs and the construction of the ActionNet database, a video database for semantic video understanding. Although the development of AI (artificial intelligence) algorithms accounts for a large part of the modern advances in AI technologies, the huge databases used for algorithm development and testing have played a great role as well. As the performance of object recognition algorithms on still images begins to surpass human ability, research interest is shifting to the semantic understanding of video content. This paper proposes the candidate visual verbs required to construct ActionNet as a training and test database for video understanding. To this end, we first investigate verb taxonomies in linguistics, and then derive visual verb candidates from video description databases and verb frequencies. Based on the derived visual verb candidates, we define and construct the ActionNet schema and database. By expanding the ActionNet database in an open environment and improving its usability, we expect to contribute to the development of video understanding technologies.

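Korean (translated)

Accurately understanding the semantic information in visual data is known as one of the most difficult challenges in artificial intelligence and machine learning. This paper proposes the derivation of visual verbs for semantic video understanding and, based on them, the construction of a video database called the ActionNet database. Advances in AI algorithms have contributed greatly to today's remarkable progress in AI technology, but the availability of massive databases for training algorithms and evaluating their performance has contributed just as much. Visual information processing was long a difficult area for AI, yet as object recognition in still images has begun to surpass human performance, research is gradually moving toward semantic understanding of video content. This paper derives the candidate visual verbs required to build ActionNet as a training and test database for such video understanding. To this end, we review linguistics-based verb classification schemes and derive visual verb candidates from data that describe the visual information in videos and from the frequency of visual verbs in linguistics. Based on the visual verb classification scheme and the visual verb candidates, we define the ActionNet database schema and build the database. By extending the proposed visual verbs, schema, and the resulting ActionNet database in an open environment and improving their usability, we expect to contribute to the advancement of video understanding technology.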

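Summary
Abstract
1. Introduction
2. Related Work
2.1 Linguistics-Based Related Work
2.2 Related Work on Video Database Construction
3. Deriving Visual Verb Candidates
4. Building the ActionNet Ontology
4.1 Adding Objects and Visual Verbs
4.2 Method for Tagging Visual Verbs in Videos
4.3 Result Data from Tagging Visual Verbs in Videos
5. Conclusion
References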
