St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/2193
Title: Using data mining technique to predict student dropout in St. Mary’s University College: Its implication to quality of education
Authors: Semeon, Getahun
Keywords: Dropout, Decision Tree, J48, RandomForest, Neural Network Multilayer Perceptron, Higher Education Institutions, Data mining, Classification, Feature Selection
Issue Date: Aug-2011
Publisher: St. Mary's University
Abstract: One of the major challenges affecting the performance of Private Higher Education Institutions (PHEIs) is the increasing number of dropouts. To address this problem, PHEIs must identify dropout trends and the major determinants of high dropout rates. Data mining is becoming a new source of information for higher education institutions and can be used to identify dropout trends and their possible determinants. Despite the expansion of PHEIs and of student enrollment in both undergraduate and postgraduate programs, the dropout rate is high and increasing in both private and public higher education institutions (HEIs) in Ethiopia. The challenge is even more significant in private HEIs. An extensive literature search did not reveal any study applying data mining or other techniques to predict dropout in the context of Ethiopian HEIs or other low-income countries. Demonstrating that data mining techniques can be applied to student dropout in the Ethiopian HEI context is therefore relevant and innovative. This study applies data mining techniques for better and more timely prediction of dropout among degree students. The basic research question is: can traditional machine learning be applied to rank students by their likelihood of dropping out? Classification and feature selection algorithms were used to build the prediction models. OneR, RandomForest, and Neural Network (Multilayer Perceptron) demonstrated the highest performance in terms of percentage of correct classifications. The accuracy of the classifiers ranges between 87% and 94.5%. CGPA was selected as the strongest predictor of dropout, followed by Term 1 and Term 2 GPAs. Age and previous college result rank fourth and fifth in predictive power.
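Illustrative note: the OneR classifier named in the abstract builds, for each attribute, a rule that predicts the majority class for every attribute value, then keeps the single attribute whose rule makes the fewest training errors. The following is a minimal sketch of that idea in plain Python, not the study's implementation. The function name `one_r`, the banded attributes, and the six toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (best_feature_index, rule, training_errors) for a OneR classifier."""
    best = None
    for f in range(len(rows[0])):
        # Count class labels for each observed value of feature f.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[f]][label] += 1
        # The rule predicts the majority class for each feature value.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Training errors made by this one-attribute rule.
        errors = sum(rule[row[f]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[2]:
            best = (f, rule, errors)
    return best

# Hypothetical toy records: (CGPA band, age band) with an outcome label.
rows = [
    ("low", "young"), ("low", "old"), ("low", "young"),
    ("high", "young"), ("high", "old"), ("high", "old"),
]
labels = ["dropout", "dropout", "stay", "stay", "stay", "stay"]

feature, rule, errors = one_r(rows, labels)
print(feature, rule, errors)
```

On this toy data the CGPA band (index 0) is selected because its rule misclassifies one record, while the age band rule misclassifies two. Continuous attributes such as GPA would need to be discretized first; the banding used here is an assumption of the sketch.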
URI: http://hdl.handle.net/123456789/2193
Appears in Collections: Proceedings of the 9th National Conference on Private Higher Education Institutions (PHEIs) in Ethiopia

Files in This Item:
File: Getahun Semeon.pdf
Size: 1.04 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.