St. Mary's University Institutional Repository

Please use this identifier to cite or link to this item: http://hdl.handle.net/123456789/7875
Title: Sebat Bet Gurage (Chaha)-Amharic Machine Translation using Deep Learning
Authors: Yirga, Dilu
Keywords: Natural Language Processing, Machine Translation
Issue Date: Feb-2024
Publisher: St. Mary's University
Abstract: Natural Language Processing (NLP) is a method for computers to intelligently analyze, understand, and derive meaning from human language. Machine translation is a branch of NLP used to translate text or speech from one language to another. Since before the thirteenth century, the sociolinguistic group living in the administrative "Gurage Zone" of southwestern Ethiopia has been referred to as "Gurage" (“ጉራጌ” for the people and “ጉራጊኛ” for the language). These days, with the advancement of technology, there is a need to translate official documents, news, and other written texts between languages. The Sebat Bet Gurage-Amharic pair is one that needs such translation technology; however, no machine translation research has been conducted between Sebat Bet Gurage, particularly the Chaha variety, and Amharic. In this study, we developed a Chaha-Amharic machine translation model using an encoder-decoder approach. We collected 5,200 Chaha-Amharic parallel sentences from different sources and preprocessed the dataset through cleaning, normalization, and tokenization stages. We then experimented with encoder-decoder models built on the LSTM, Bi-LSTM, and GRU deep learning algorithms. Based on the results of our experiments, the encoder-decoder model using the Bi-LSTM algorithm achieved the best BLEU score: the Bi-LSTM model scored 22, the GRU model scored 20, and the LSTM model scored 17. The Bi-LSTM model also required the longest training time, at 1.5 hours.
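The cleaning, normalization, and tokenization stages mentioned in the abstract could be sketched roughly as below. This is an illustrative assumption, not the thesis' actual preprocessing code: the function name, the whitespace-cleaning rule, the Unicode NFC normalization, and the choice of Ethiopic punctuation marks to split off are all hypothetical choices for the sketch.

```python
import re
import unicodedata

def preprocess(sentence):
    """Sketch of a three-stage preprocessing pipeline
    (cleaning -> normalization -> tokenization) for
    Ethiopic-script text. Illustrative only."""
    # Cleaning: collapse runs of whitespace and trim the ends.
    text = re.sub(r"\s+", " ", sentence).strip()
    # Normalization: Unicode NFC, so visually identical character
    # sequences share one canonical code-point form.
    text = unicodedata.normalize("NFC", text)
    # Tokenization: split on whitespace, separating Ethiopic and
    # common punctuation into their own tokens first.
    text = re.sub(r"([።፣፤?!,.])", r" \1 ", text)
    return text.split()

print(preprocess("ሰላም  ነው።"))  # → ['ሰላም', 'ነው', '።']
```

A pipeline like this would be applied to both sides of each Chaha-Amharic sentence pair before the token sequences are fed to the encoder-decoder model.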
URI: http://hdl.handle.net/123456789/7875
Appears in Collections:Master of computer science

Files in This Item:
File: 12. Dilu Yirga.pdf (2.42 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.