Machine Learning Algorithms with Parameter Tuning to Predict Students’ Graduation-on-time: A Case Study in Higher Education

Abstract
This study aims to predict whether a student will graduate on time (GOT) using machine learning algorithms. We applied five machine learning algorithms: Random Forest, Support Vector Machine (linear kernel), Support Vector Machine (polynomial kernel), K-Nearest Neighbors, and Naïve Bayes. Each algorithm was evaluated with 10-fold cross-validation over a range of tuning parameter values. Random Forest achieved the best accuracy and kappa statistic, making it the most suitable of the five for modeling on-time graduation. The predictive model is expected to help higher education management design strategies that address weaknesses in students’ academic performance and support graduation on time.
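The two evaluation metrics named in the abstract, accuracy and the kappa statistic, together with the 10-fold split, can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names, the labels "GOT" and "late", and the example predictions are hypothetical, and the fold splitter uses contiguous, unshuffled folds purely for simplicity.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over labels of P(true = label) * P(pred = label).
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        # Degenerate case (a single label everywhere); returning 0.0 is an
        # assumption of this sketch, not a standard definition.
        return 0.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical labels and predictions, invented for illustration only.
y_true = ["GOT"] * 6 + ["late"] * 4
y_pred = ["GOT"] * 5 + ["late"] * 4 + ["GOT"]

print("accuracy:", accuracy(y_true, y_pred))
print("kappa:", round(cohen_kappa(y_true, y_pred), 4))
```

Kappa is reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when one class (for example, on-time graduates) dominates the data.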
Copyright (c) 2022 Rizal Bakri, Niken Probondani Astuti, Ansari Saleh Ahmar (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.