Abstract (English)
We designed an explainable AI system for diabetes prediction that integrates ensemble learning models with interpretability tools. Traditional diagnostic models often lack transparency, making them less suitable for clinical applications where interpretability is essential; this study therefore aims to balance predictive accuracy with interpretability. To achieve this, we developed an ensemble model combining Random Forest, XGBoost, and Logistic Regression within a Voting Classifier framework. The Synthetic Minority Oversampling Technique (SMOTE) was employed to address class imbalance in the dataset, ensuring reliable predictions across both the majority and minority classes. For interpretability, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-Agnostic Explanations) were integrated into the system to provide global and local explanations of model predictions. Experimental results demonstrated the ensemble model's high performance, achieving a recall of 0.846 and an AUC-ROC of 0.874, metrics that are crucial for minimizing false negatives in medical diagnosis. Key features such as BMI, glucose level, and age were identified as significant contributors to diabetes risk. The integration of explainability tools ensures that healthcare professionals can understand both overarching patterns and patient-specific predictions, fostering trust in clinical decision-making. This approach bridges the gap between complex machine learning models and practical medical applications, offering a robust and transparent tool for improving patient outcomes.
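
For context, the pipeline described in the abstract could be sketched roughly as follows in Python. This is a minimal sketch, assuming scikit-learn, imbalanced-learn, xgboost, and shap; the file name diabetes.csv, the label column Outcome, and all parameter values are illustrative assumptions rather than details taken from the paper.

# Minimal sketch of the described pipeline: SMOTE on the training split,
# a soft-voting ensemble of Random Forest, XGBoost, and Logistic Regression,
# and model-agnostic SHAP explanations. Column names and hyperparameters
# are illustrative assumptions, not taken from the paper.
import pandas as pd
import shap
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("diabetes.csv")                      # hypothetical file name
X, y = df.drop(columns=["Outcome"]), df["Outcome"]    # hypothetical label column

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Oversample only the training split so the held-out test set stays untouched.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=42)),
        ("xgb", XGBClassifier(eval_metric="logloss", random_state=42)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",  # soft voting averages the models' predicted probabilities
)
ensemble.fit(X_res, y_res)

proba = ensemble.predict_proba(X_test)[:, 1]
print("Recall :", recall_score(y_test, (proba >= 0.5).astype(int)))
print("AUC-ROC:", roc_auc_score(y_test, proba))

# Model-agnostic SHAP explanations on a small background sample
# (KernelExplainer is slow, so both sets are subsampled here).
background = shap.sample(X_train, 100, random_state=42)
explainer = shap.KernelExplainer(lambda d: ensemble.predict_proba(d)[:, 1], background)
shap_values = explainer.shap_values(X_test.iloc[:50])
shap.summary_plot(shap_values, X_test.iloc[:50])

Soft voting, shown above, is one common way to combine the three classifiers' probability estimates; the abstract does not specify whether hard or soft voting was used, nor how SHAP and LIME were configured, so those choices here are assumptions.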
Table of Contents
1. Introduction
2. Literature Review
3. Methodology
3.1 Data Preprocessing
3.2 Model Architecture
3.3 Explainability Techniques
3.4 Implementation
4. Results and Discussion
4.1 Model Performance
4.2 Explainability Analysis
4.3 Ensemble Model Complexity and Clinical Applicability
5. Conclusion
6. Discussion
References
