Article Information
Abstract
English
For many years, natural language processing (NLP) programming tools have been used to process information in various application areas, including medicine. However, most such systems have been developed by expert programmers, and few, if any, by clinicians. The subject of this article is the automatic categorization of clinical data. This task requires a great deal of clinical cognition, so there is a need to let clinicians develop such systems themselves. This article is an attempt in that direction, using the RapidMiner environment for the purpose. It describes how RapidMiner, as a visual programming environment, can be used for tokenization and categorization of clinical narratives, and how to select the best classifier for categorization. The K-NN classifier categorizes clinical narratives with high accuracy, even on a large dataset such as the i2b2 smoking challenge data.
Table of Contents
1. Introduction
2. RapidMiner for Tokenization and Categorization
3. Categorizing Clinical Narratives
4. Validating the Categorization Ability of the K-NN
5. Discussion and Conclusion
Acknowledgements
References