Telecom Voice Traffic Termination Fraud Detection Using Ensemble Learning: The Case of Ethio Telecom

Getahun, Alemeshet

Full metadata record

DC Field	Value	Language
dc.contributor.author	Getahun, Alemeshet	-
dc.date.accessioned	2021-09-24T06:36:39Z	-
dc.date.available	2021-09-24T06:36:39Z	-
dc.date.issued	2020-07	-
dc.identifier.uri	http://hdl.handle.net/123456789/6226	-
dc.description.abstract	One of the major developments in machine learning is the ensemble method, which builds a highly accurate classifier by combining many moderately accurate component classifiers. This thesis proposes ensemble classification methods; the resulting model provides information that can be used for decision making. A comparative study was also carried out to find the most suitable classifier for the ensemble techniques used in the proposed model. We selected around 126,736 records from two months of call detail record data. After eliminating irrelevant and unnecessary data, a total of 50,516 records were used to conduct this study. Ten attributes were selected based on their relevance to the research. Data preprocessing was done to clean the dataset, after which the collected data was prepared in a format suitable for the data mining (DM) tasks. The study was conducted using the Waikato Environment for Knowledge Analysis (WEKA) version 3.8.3 machine learning software. Four ensemble-based machine learning paradigms for classification were used, namely boosting, bagging, stacking and voting classifiers, built on two base learners (decision tree and neural network). The training models were built using cross validation and tested for reliability using the default percentage split (66%). Model performance was evaluated using the standard metrics of prediction accuracy, error rate, FP rate, TP rate, recall, precision, F-measure and ROC curve, which are calculated from the predictive classification table known as the confusion matrix. The performance of each algorithm was compared to select the best-performing one. The results show that the ensemble J48 decision tree algorithm with 10-fold cross validation registered the best performance, at 96.73%. The boosting classifier provided higher prediction accuracy than the other classifiers. In this study, we found that the proposed ensemble methods provide a significant improvement in prediction accuracy compared to individual classifiers.	en_US
dc.language.iso	en	en_US
dc.publisher	ST. MARY’S UNIVERSITY	en_US
dc.subject	Ensemble methods, Data mining, Boosting, Bagging, Stacking, Voting	en_US
dc.title	Telecom Voice Traffic Termination Fraud Detection Using Ensemble Learning: The Case of Ethio Telecom	en_US
dc.type	Thesis	en_US
Appears in Collections:	Master of computer science