

Classification of Malicious Domain Names using Support Vector Machine and Bi-gram Method



Every day, millions of domains are registered, and some of them are related to malicious activities. Recently, domain names have been used to operate malicious networks such as botnets and other types of malicious software (malware). Studies have revealed that it is challenging to keep track of malicious domains through Web content analysis or human observation because of the large number of domains. Legitimate domain names usually consist of English words or other meaningful sequences and are easy for humans to understand, whereas malicious domain names are generated randomly and either contain no meaningful words or are otherwise unreadable. A recently proposed method for classifying malicious domain names uses many features from DNS queries, including some textual features. However, such data appear difficult to collect and maintain. Our contribution is to show that, by using only the domain names, we can achieve better classification results, which indicates that domain names themselves contain enough information for classification.
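
As a rough illustration of the bi-gram idea named in the title, the sketch below counts overlapping character bi-grams in a domain name and maps them onto a fixed vocabulary to form a feature vector. This is only a minimal sketch under stated assumptions: the function names `char_bigrams` and `to_vector`, the choice to use only the leftmost label of the domain, and the example vocabulary are hypothetical and are not taken from the paper, whose exact feature definition is not given in this abstract.

```python
from collections import Counter


def char_bigrams(domain: str) -> Counter:
    """Count overlapping character bi-grams in a domain name.

    Assumption (not from the paper): only the leftmost label is used,
    e.g. "google" for "google.com", and the text is lowercased.
    """
    label = domain.lower().split(".")[0]
    return Counter(label[i:i + 2] for i in range(len(label) - 1))


def to_vector(domain: str, vocabulary: list) -> list:
    """Map a domain name to bi-gram counts over a fixed vocabulary.

    The vocabulary is a hypothetical, caller-supplied list of bi-grams;
    bi-grams outside the vocabulary are ignored.
    """
    counts = char_bigrams(domain)
    return [counts.get(bigram, 0) for bigram in vocabulary]


if __name__ == "__main__":
    # Hypothetical vocabulary, chosen only for this example.
    vocab = ["go", "oo", "og", "gl", "le", "xq"]
    print(to_vector("google.com", vocab))
```

Vectors of this kind could then be passed to an SVM classifier; the training setup itself is not described in this abstract and is not sketched here.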


 1. Introduction
 2. Background and Related Work
  2.1. DNS Concept
  2.2. DNS Queries
  2.3. Related Work
 3. Data Sets and Feature Extraction
  3.1. Data Collection
  3.2. Constructing the Dataset
  3.3. Feature Extraction
 4. SVM Classifier
  4.1. SVM
  4.2. SVM light
 5. Result and Discussion
 6. Conclusion


  • Nhauo Davuth, Konkuk University, South Korea
  • Sung-Ryul Kim, Konkuk University, South Korea


