Abstract
This article provides step-by-step instructions for quickly building a corpus by applying data mining techniques to the popular social network service Twitter. The examples use maritime-related vocabulary, but the same techniques apply to any corpus that can be assembled by keyword search; for example, a corpus of business English or engineering English could be built by substituting the relevant keywords. The corpus is constructed by first authenticating with Twitter, then searching for and collecting tweets that contain the target words. Some standard corpus operations are then briefly explored. A function for displaying collocations is given and explained. Finally, a keyword function is provided that compares the maritime tweet corpus with a reference corpus and weights the words that are more likely to occur in the tweet corpus.
Table of Contents
Preliminaries: Working with Twitter
Preliminaries: Working with R
Collocation function
Keyword function
Conclusion
References