Evaluation of the random forest model as a tool for predicting credit risk in university students


César León Velarde
Yenso Rodrigo Lino García
Guillermo Victor Solano Rosembertt

Abstract

Credit risk among university students is a growing problem in Peru, where financial inclusion remains low. Many young people resort to informal loans or struggle to access formal credit. Machine learning algorithms are increasingly applied to credit scoring for such borrowers, and Random Forest (RF) is a popular choice because of its predictive capacity and its ability to handle complex variables. This study aimed to identify the factors relevant to credit risk in the socioeconomic, academic, and financial behavior of Peruvian university students, and to compare the predictions of the RF model with those of the traditional model. The study was quantitative, basic, and non-experimental, with a cross-sectional design. Data were collected through questionnaires and database queries. Pareto diagrams, Ishikawa diagrams, and VOSviewer were used as analysis tools to support the theoretical framework. The data were preprocessed, and the Random Forest model was trained with cross-validation and evaluated using accuracy, recall, and F1 score. The model achieved 78% accuracy in credit risk classification. The key variables were family income, payment history, credit card usage, and academic performance. These results indicate that Random Forest is a robust model for predicting credit risk compared with traditional approaches. It can be used to improve financial decision-making, reduce delinquency, and support fairer and safer financing policies for university students.
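The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `classification_metrics`, the label encoding (1 = high credit risk, 0 = low credit risk), and the ten example labels are all hypothetical and chosen only to show how accuracy, recall, and F1 are derived from a confusion matrix.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption (not from the article): 1 marks a high-risk student, 0 a low-risk one.
    """
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for ten students (illustrative only, not study data)
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a cross-validated setup such as the one the abstract describes, these metrics would typically be computed on each held-out fold and then averaged. Recall is often the metric of most interest in credit risk work, since a missed high-risk borrower (a false negative) tends to cost more than a false alarm.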


Article Details

How to Cite
León Velarde, C., Lino García, Y. R., & Solano Rosembertt, G. V. (2026). Evaluation of the random forest model as a tool for predicting credit risk in university students. Aula Virtual, 7(14), 1167–1195. https://doi.org/10.5281/zenodo.20394963

