A toolkit for creating a maritime corpus of tweets for language analysis

Abstract

This article gives step-by-step instructions for quickly building a corpus by applying data mining techniques to the popular social network service Twitter. The examples use maritime-related vocabulary, but the same techniques apply to any corpus that can be assembled by keyword search; for example, corpora of business English or engineering English could be built by substituting the relevant keywords. The corpus is constructed by first authenticating with Twitter, then searching for and collecting tweets that contain the target words. Some standard corpus operations are then briefly explored. A function for displaying collocations is given and explained. Finally, a keyword function is provided that compares the maritime tweet corpus to a reference corpus and weights the words that are more likely to occur in the tweet corpus.
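The two functions named in the abstract can be illustrated with a minimal sketch. The article itself works in R and its code is not reproduced on this page, so the Python below is an illustration only, not the author's implementation. It assumes the tweets have already been collected as a list of strings. The function names (tokenize, collocates, keyness), the window size, and the choice of log-likelihood as the keyword statistic are all assumptions; the abstract does not say which weighting the article uses.

```python
import math
from collections import Counter

PUNCT = ".,!?;:\"'()#@"

def tokenize(text):
    """Lowercase a string, split on whitespace, strip edge punctuation."""
    tokens = (w.strip(PUNCT).lower() for w in text.split())
    return [t for t in tokens if t]

def collocates(texts, node, window=2):
    """Count words occurring within `window` tokens of `node` (hypothetical helper)."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == node:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                counts.update(left + right)
    return counts

def keyness(target_tokens, reference_tokens):
    """Rank words that are relatively more frequent in the target corpus.

    Uses a log-likelihood score (an assumed statistic, see lead-in).
    Returns (word, score) pairs sorted from highest to lowest score.
    """
    t, r = Counter(target_tokens), Counter(reference_tokens)
    nt, nr = sum(t.values()), sum(r.values())
    scores = {}
    for word, a in t.items():
        b = r.get(word, 0)
        # Expected counts if the word were equally likely in both corpora.
        e1 = nt * (a + b) / (nt + nr)
        e2 = nr * (a + b) / (nt + nr)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        # Keep only words over-represented in the target corpus.
        if a / nt > b / nr:
            scores[word] = ll
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Calling collocates(tweets, "ship", window=1) on a list of tweet strings returns a Counter of the immediate neighbours of "ship", and keyness(tokenize(tweet_text), tokenize(reference_text)) returns the words that are disproportionately frequent in the tweets. A real reference corpus would need to be far larger than a toy example for the scores to be meaningful.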

Table of Contents

Introduction
Preliminaries: Working with Twitter
Preliminaries: Working with R
Collocation function
Keyword function
Conclusion
References

Author Information

  • Kevin Parent, Korea Maritime University
