A toolkit for creating a maritime corpus of tweets for language analysis

Kevin Parent

Korea Maritime University, 세계해양발전연구소. 세계해양발전연구, Vol. 26, February 2017, pp. 207-215

Abstract

This article presents step-by-step instructions for quickly building a corpus by applying data mining techniques to the popular social network service Twitter. The examples use maritime-related vocabulary, but the same techniques apply to any corpus that can be assembled by keyword search; for example, corpora of business English or engineering English could be built by substituting the relevant keywords. The corpus is constructed by first authenticating with Twitter, then searching for and collecting tweets that contain the target words. Some standard corpus operations are then briefly explored. A function for displaying collocations is given and explained. Finally, a keyword function is provided that compares the maritime tweet corpus to a reference corpus and weights the words that are more likely to occur in the tweet corpus.
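The collocation and keyword steps described in the abstract can be illustrated with a minimal sketch. This is not the article's own code: the function names (`tokenize`, `collocations`, `keywords`), the window size, and the smoothed log-ratio weighting are all assumptions chosen for illustration, and tweet collection is left out because it depends on Twitter credentials. The sketch assumes the tweets have already been gathered into a list of strings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocations(texts, target, window=2):
    """Count the words found within `window` tokens of `target`."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == target:
                start = max(0, i - window)
                # Neighbours on both sides, excluding the target itself.
                counts.update(tokens[start:i] + tokens[i + 1:i + 1 + window])
    return counts


def keywords(corpus_texts, reference_texts, smoothing=1.0):
    """Score each corpus word by a smoothed log ratio of relative frequencies.

    Positive scores mark words that are relatively more frequent in the
    corpus than in the reference corpus. The log ratio is an assumed
    weighting; other keyness measures could be substituted.
    """
    corpus = Counter(t for text in corpus_texts for t in tokenize(text))
    reference = Counter(t for text in reference_texts for t in tokenize(text))
    n_corpus = sum(corpus.values())
    n_reference = sum(reference.values())
    vocab_size = len(set(corpus) | set(reference))
    scores = {}
    for word, freq in corpus.items():
        p_corpus = (freq + smoothing) / (n_corpus + smoothing * vocab_size)
        p_reference = (reference[word] + smoothing) / (
            n_reference + smoothing * vocab_size
        )
        scores[word] = math.log2(p_corpus / p_reference)
    return scores


if __name__ == "__main__":
    # Hypothetical stand-ins for collected tweets and a reference corpus.
    maritime = ["The ship left the port at dawn", "A cargo ship entered the port"]
    reference = ["The cat sat on the mat", "The dog left the house"]
    print(collocations(maritime, "ship", window=1).most_common())
    ranked = sorted(keywords(maritime, reference).items(), key=lambda kv: -kv[1])
    print(ranked[:3])
```

Additive smoothing keeps the ratio defined for words that never occur in the reference corpus, which is the common case for specialised vocabulary.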

Author information

Kevin Parent, Korea Maritime University
