Analyzing the correlation of Spam Recall and Thesaurus

Kang, Sin-Jae; Kim, Jong-Wan


Source information

한국정보기술응용학회 (Korean society for information technology applications), conference proceedings, 6th 2005 International Conference on Computers, Communications and System, November 2005, pp. 21-25


Abstract

In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information consists of the mail sender's information, URLs, and a specific spam list; the less definite information consists of the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. In the second phase, the lexical information and concept codes contained in the email body were used for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when concept codes were included in the features as well.
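
The two-phase flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the sender list, URL host list, toy thesaurus, concept-code names, and all function names are hypothetical, and the trained SVM is left as a pluggable callable (`phase2_classifier`) rather than reproduced.

```python
import re

# Hypothetical phase-1 resources; the paper's actual lists are not given.
SPAM_SENDERS = {"promo@bulk-offers.example"}
SPAM_URL_HOSTS = {"cheap-pills.example"}

# Hypothetical toy thesaurus mapping words to concept codes.
THESAURUS = {
    "cheap": "C_PRICE",
    "discount": "C_PRICE",
    "pills": "C_DRUG",
    "medication": "C_DRUG",
}

URL_HOST_RE = re.compile(r"https?://([^/\s]+)")
WORD_RE = re.compile(r"[a-z]+")


def phase1_is_spam(sender, body):
    """Phase 1: decide from definite information (sender, URL hosts)."""
    if sender.lower() in SPAM_SENDERS:
        return True
    hosts = {h.lower() for h in URL_HOST_RE.findall(body)}
    return bool(hosts & SPAM_URL_HOSTS)


def phase2_features(body, use_concepts=True):
    """Phase 2: lexical features, optionally extended with concept codes."""
    words = WORD_RE.findall(body.lower())
    feats = {"w:" + w for w in words}
    if use_concepts:
        feats |= {"c:" + THESAURUS[w] for w in words if w in THESAURUS}
    return feats


def classify(sender, body, phase2_classifier):
    """Run phase 1 first; fall back to the phase-2 classifier otherwise.

    phase2_classifier is an assumed callable (for example, a trained SVM
    wrapped in a function) that maps a feature set to True for spam.
    """
    if phase1_is_spam(sender, body):
        return True
    return phase2_classifier(phase2_features(body))


def spam_precision_recall(predicted, actual):
    """Spam precision = TP / (TP + FP); spam recall = TP / (TP + FN)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

In this toy setup, a spam message containing "discount medication" shares no lexical feature with a training message containing "cheap pills", but the two share both concept codes. That is one plausible reading of why adding concept codes could raise spam recall, although the abstract itself does not spell out the mechanism.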

Keywords

information filtering; spam-mail filtering; conceptual information; spam recall; thesaurus

Author information

Kang, Sin-Jae: School of Computer and Information Technology, Daegu University
Kim, Jong-Wan: School of Computer and Information Technology, Daegu University
