Analyzing the correlation of Spam Recall and Thesaurus

Abstract (English)

In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam from legitimate mail. The definite information consists of the mail sender's information, URLs, and a certain spam list; the less definite information consists of the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then applied the less definite information. In the second phase, the lexical information and concept codes contained in the email body were used for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when concept codes were also included as features.
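The two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the blocklists, the word-to-concept-code table, the feature vocabulary, and the `toy_model` stand-in for a trained SVM are all hypothetical names and values chosen for the example.

```python
from typing import Callable, Dict, List, Set

# Hypothetical phase-1 resources (definite information); not from the paper.
BLOCKED_SENDERS: Set[str] = {"promo@spam.example"}
BLOCKED_URLS: Set[str] = {"http://spam.example/buy"}

# Hypothetical word -> concept-code table standing in for a thesaurus lookup.
CONCEPT_CODES: Dict[str, str] = {
    "loan": "C_MONEY",
    "credit": "C_MONEY",
    "meeting": "C_WORK",
}

# Hypothetical feature vocabulary: two words followed by two concept codes.
VOCABULARY: List[str] = ["loan", "meeting", "C_MONEY", "C_WORK"]


def phase_one_is_spam(sender: str, body: str) -> bool:
    """Phase 1: match definite information (sender address, known URLs)."""
    if sender.lower() in BLOCKED_SENDERS:
        return True
    return any(url in body for url in BLOCKED_URLS)


def extract_features(body: str, vocabulary: List[str]) -> List[int]:
    """Phase 2 features: binary indicators for words and concept codes."""
    tokens = body.lower().split()
    present = set(tokens)
    # Add the concept code of every token that has an entry in the table.
    present.update(CONCEPT_CODES[t] for t in tokens if t in CONCEPT_CODES)
    return [1 if feature in present else 0 for feature in vocabulary]


def classify(
    sender: str,
    body: str,
    vocabulary: List[str],
    second_phase: Callable[[List[int]], bool],
) -> str:
    """Apply phase 1 first; fall back to the learned classifier otherwise."""
    if phase_one_is_spam(sender, body):
        return "spam"
    features = extract_features(body, vocabulary)
    return "spam" if second_phase(features) else "legitimate"


def toy_model(features: List[int]) -> bool:
    """Stand-in for a trained SVM: flags mail carrying the C_MONEY concept."""
    return features[2] == 1
```

In this sketch, a message containing "credit" is caught in the second phase even though "credit" is not itself a vocabulary word, because its concept code is a feature. That is one plausible reading of why concept codes could raise recall, offered here as an illustration rather than as the paper's analysis.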

Table of Contents

Abstract
 1. Introduction
 2. Training Phase
  2.1. Definite Information
  2.2. Less Definite Information 
  2.3. Kadokawa Thesaurus
  2.4. Constructing Feature Vectors 
 3. Applying Phase
 4. Experiments
 5. Conclusion
 References

Author Information

  • Kang, Sin-Jae, School of Computer and Information Technology, Daegu University
  • Kim, Jong-Wan, School of Computer and Information Technology, Daegu University
