Smart Choice: Machine Learning Insight into Factors Influencing Students’ Programme Selection at the Tertiary Institution

Authors

A. M. Yakubu, E. Ofori

DOI:

https://doi.org/10.64539/sjcs.v2i1.2026.397

Keywords:

Educational data mining (EDM), Machine learning (ML), Artificial Intelligence (AI), K-prototype, Students, Programme Choice

Abstract

Understanding the factors influencing students’ choice of programme of study is increasingly important for tertiary institutions in Ghana, particularly amid rising enrolment rates and growing competition. While prior studies have applied machine learning to predict academic performance, limited research has examined programme selection behaviour at the senior high school level using mixed-type clustering techniques. This study addresses this gap by applying the K-prototype clustering algorithm and supervised classification models to survey data collected from 1,042 final-year Business and Home Economics students across ten senior high schools in Northern Ghana. The clustering process identified three behavioural segments comprising 423, 382, and 237 students, respectively, with the majority aged 16–20 years. Internal validation metrics indicated modest cluster separation. Subsequent classification modelling using Naïve Bayes, Logistic Regression, Decision Tree (J48), Random Forest, and Support Vector Machine (SVM) showed that SVM achieved the highest predictive performance (Accuracy = 99%) when predicting cluster membership. Key influencing factors included parental education, parental occupation, counselling exposure, socio-cultural beliefs, and peer influence. The findings highlight the need for strengthened, context-sensitive guidance and counselling frameworks at the pre-tertiary level to support informed and independent programme selection decisions.
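To make the mixed-type clustering step concrete, the sketch below shows the kind of dissimilarity a K-prototype algorithm typically uses: squared Euclidean distance on numeric attributes plus a weight (here called gamma) times the number of categorical mismatches. It is a minimal illustration and not the authors' implementation. The toy records, the attribute choices (age, parental education, counselling exposure), the gamma value, and the batch-style update loop are all assumptions made for this example.

```python
from collections import Counter

def mixed_distance(record, prototype, gamma):
    """Squared Euclidean distance on the numeric part plus
    gamma times the number of categorical mismatches."""
    (num_x, cat_x), (num_p, cat_p) = record, prototype
    numeric = sum((a - b) ** 2 for a, b in zip(num_x, num_p))
    mismatches = sum(a != b for a, b in zip(cat_x, cat_p))
    return numeric + gamma * mismatches

def update_prototype(members):
    """New prototype: mean of each numeric attribute,
    mode of each categorical attribute."""
    n = len(members)
    means = tuple(sum(col) / n for col in zip(*(m[0] for m in members)))
    modes = tuple(Counter(col).most_common(1)[0][0]
                  for col in zip(*(m[1] for m in members)))
    return (means, modes)

def k_prototypes(records, initial_prototypes, gamma=1.0, max_iter=20):
    """Simplified batch variant: assign every record, then update all
    prototypes, and stop once the assignments no longer change."""
    prototypes = list(initial_prototypes)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(prototypes)),
                key=lambda j: mixed_distance(r, prototypes[j], gamma))
            for r in records
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(prototypes)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old prototype if a cluster is empty
                prototypes[j] = update_prototype(members)
    return labels, prototypes

# Hypothetical toy data: ((age,), (parental_education, counselling_exposure))
records = [
    ((17.0,), ("tertiary", "yes")),
    ((18.0,), ("tertiary", "yes")),
    ((17.0,), ("tertiary", "no")),
    ((19.0,), ("basic", "no")),
    ((20.0,), ("basic", "no")),
    ((19.0,), ("basic", "yes")),
]

# Seed two clusters with the first and fourth records (an arbitrary choice).
labels, prototypes = k_prototypes(records, [records[0], records[3]], gamma=1.0)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In practice the numeric attributes would normally be scaled before clustering, and gamma would be tuned so that neither the numeric nor the categorical part dominates the distance.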

Published

2026-03-11

How to Cite

Yakubu, A. M., & Ofori, E. (2026). Smart Choice: Machine Learning Insight into Factors Influencing Students’ Programme Selection at the Tertiary Institution. Scientific Journal of Computer Science, 2(1), 107–118. https://doi.org/10.64539/sjcs.v2i1.2026.397