Source Information
Abstract
English
Content-based email spam filtering is a challenging problem in which emails are often represented as high-dimensional data. This paper proposes an approach to email spam filtering based on max-margin Semi-NMF (MNMF). MNMF combines the ideas of Semi-NMF and max-margin classification, performing dimension reduction and classification simultaneously. In MNMF, the coefficient matrix is updated with the same approach as Semi-NMF (while the other parameters are held fixed) rather than by quadratic programming. Simulation experiments were performed on two public Chinese email corpora. The results show that MNMF is much faster and performs much better than support vector machine (SVM) classifiers that use features extracted by principal component analysis or linear discriminant analysis. MNMF also outperforms SVM classification schemes combined with feature extraction based on NMF and Semi-NMF.
Table of Contents
1. Introduction
2. NMF and Semi-NMF
2.1. NMF
2.2. Semi-NMF
3. Main Title
4. Experiments
4.1. Data Sets and Evaluation Metrics
4.2. Designs and Results Analysis
5. Conclusions and Future Work
References
