Source Information
Abstract
This paper discusses classification methods for improving the reliability and quality of software systems by predicting fault-prone modules before testing. The classification capability of data mining techniques, together with the object-oriented property knowledge captured in object-oriented (OO) metrics, is used to classify each software module either as fault-prone in one of several error categories or as not fault-prone. Three versions of Eclipse, the Java-based open-source integrated development environment, serve as the dataset for training and testing all of the classification-based data mining techniques. First, threshold-based feature ranking (i.e., area under the ROC curve) is applied to select effective OO metrics for building the prediction model. Next, using the selected attribute subsets, classification models are built with 41 different classifiers for multinomial classification in fault detection systems. Finally, each classifier's performance is evaluated with respect to the PRC performance metric. Based on these results, the classifiers that show higher performance and accuracy than the others (Random Committee, Random Tree, Randomizable Filtered Classifier, and IBk) are selected. Our results indicate that Random Tree, Random Committee, and Randomizable Filtered Classifier perform equally well. IBk performs only slightly worse, and K-star performs worse than the four selected classifiers.
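The feature-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified binary setting (fault-prone versus not fault-prone, whereas the paper uses several error categories), it scores each metric by the area under the ROC curve obtained when the raw metric value is used as the score, and the metric names (`wmc`, `dit`, `cbo`) and all values are made up for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def auc_single_feature(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUC obtained when one metric's raw value is used as the score.

    Computed as the Mann-Whitney statistic: the fraction of
    (fault-prone, not fault-prone) module pairs in which the fault-prone
    module has the larger metric value, counting ties as one half.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one module of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_metrics_by_auc(
    metrics: Dict[str, Sequence[float]], labels: Sequence[int]
) -> List[Tuple[str, float]]:
    """Return (metric name, AUC) pairs sorted from highest to lowest AUC."""
    scores = [(name, auc_single_feature(col, labels)) for name, col in metrics.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six modules, 1 = fault-prone, 0 = not fault-prone.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_metrics = {
    "wmc": [3, 5, 4, 9, 12, 8],
    "dit": [1, 2, 3, 1, 2, 3],
    "cbo": [2, 6, 3, 5, 7, 4],
}

for name, auc in rank_metrics_by_auc(toy_metrics, toy_labels):
    print(f"{name}: AUC = {auc:.3f}")
```

In a pipeline of this kind, the top-ranked metrics would form the attribute subset passed to the classifiers; how many metrics to keep is a separate design choice that this sketch does not address.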
Table of Contents
1. Introduction
2. Related Work
3. Research Background
3.1. Dataset Description
3.2. Software Metrics Studied
3.3. Collection of Error Data
3.4. Empirical Data Collection
4. Research Method
4.1. Threshold Based Feature Ranking (TBFR)
4.2. Classification
4.3. Performance Measures
5. Experimental Analysis
5.1. TBFR Method for Feature Selection
5.2. Results and Discussion
5.3. Random Committee, Random Tree, Randomizable Filtered Classifier, IBk and K-star Performance Evaluation
6. Conclusion
References
