Student Success Predictor
Built on the Open University Learning Analytics Dataset (OULAD) with 32,593 students across 7 modules. Predicts four outcomes — Pass, Fail, Distinction, Withdrawn — at the halfway point of a module, when there’s still time to intervene.
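A minimal sketch of what a halfway-point feature cutoff could look like, so that no activity from the second half of the module leaks into the features. It assumes rows shaped like OULAD's studentVle table (id_student, date in days relative to module start, sum_click) and a module length in days like the courses table's module_presentation_length. The toy rows, the function name clicks_before_cutoff, and the 0.5 fraction are illustrative assumptions, not the project's actual code.

```python
from collections import defaultdict

# Hypothetical toy rows mirroring OULAD's studentVle columns:
# (id_student, date, sum_click). `date` is days relative to module start,
# so negative values are activity before the official start.
vle_rows = [
    (101, -5, 3),
    (101, 40, 10),
    (101, 200, 7),   # after the cutoff, must be ignored
    (102, 10, 2),
    (102, 130, 4),   # on the cutoff day, ignored (strictly-before rule)
    (102, 140, 9),
]


def clicks_before_cutoff(rows, module_length_days, fraction=0.5):
    """Sum clicks per student using only activity strictly before the cutoff day."""
    cutoff_day = module_length_days * fraction
    totals = defaultdict(int)
    for student, day, clicks in rows:
        if day < cutoff_day:
            totals[student] += clicks
    return dict(totals)


# Assumed module length of 260 days, giving a cutoff at day 130.
features = clicks_before_cutoff(vle_rows, module_length_days=260)
print(features)  # {101: 13, 102: 2}
```

The same cutoff would need to be applied to every time-stamped source (for example assessment submissions) before any model is trained, otherwise the evaluation overstates what is knowable at the halfway point.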
XGBoost with SMOTE oversampling achieves 67.7% accuracy and 0.877 macro ROC-AUC. For early intervention the Withdrawn class matters most, and the model reaches 79% precision and 74% recall on it. SHAP explainability surfaces the top predictors (VLE activity patterns, assessment submission timing) in terms tutors can interpret.
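For readers less familiar with per-class metrics, here is a small sketch of how precision and recall for a single class such as Withdrawn are computed from true and predicted labels. The labels below are made-up toy data, and precision_recall and f1 are hypothetical helper names. The last lines derive the F1 score implied by the rounded 79% and 74% figures, which is only approximate because those inputs are themselves rounded.

```python
def precision_recall(y_true, y_pred, positive):
    """One-vs-rest precision and recall for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy labels (not project data): 4 true withdrawals, 4 predicted withdrawals.
y_true = ["Withdrawn", "Withdrawn", "Withdrawn", "Withdrawn",
          "Pass", "Pass", "Fail", "Distinction"]
y_pred = ["Withdrawn", "Withdrawn", "Withdrawn", "Pass",
          "Withdrawn", "Pass", "Fail", "Pass"]

precision, recall = precision_recall(y_true, y_pred, positive="Withdrawn")
print(precision, recall)  # 0.75 0.75

# F1 implied by the reported, rounded Withdrawn figures.
reported_f1 = f1(0.79, 0.74)
print(round(reported_f1, 3))  # 0.764
```

In intervention terms, precision is the share of flagged students who actually withdraw, and recall is the share of withdrawing students who get flagged.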
K-Means behavioural clustering (silhouette score 0.597) groups students into 4 engagement profiles, so interventions can be tailored to each profile instead of sending a single generic alert. The pipeline follows CRISP-DM end to end: business understanding → data preparation → modelling → evaluation → deployment readiness.
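To make the silhouette figure easier to read, the sketch below computes a mean silhouette score from points and cluster labels using only the standard library. For each point, a is the mean distance to the other members of its own cluster, b is the mean distance to the nearest other cluster, and the point's score is (b - a) / max(a, b). Scores near 1 indicate compact, well-separated clusters. The two-feature toy points and the function name are assumptions for illustration, and in practice a library routine such as scikit-learn's silhouette_score would normally be used instead.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette over all points (Euclidean distance).

    Assumes at least two distinct cluster labels and that the points
    are not all identical.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            # By convention a singleton cluster contributes 0.
            scores.append(0.0)
            continue
        # Distance to itself is 0, so divide by the number of *other* members.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # Mean distance to the closest neighbouring cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy data: two tight, well-separated groups in a 2-feature space.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
labels = [0, 0, 1, 1]

score = silhouette_score(points, labels)
print(round(score, 3))  # 0.802
```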