Statistical Machine Translation Dayak Language – Indonesia Language

Muhammad Fiqri Khaikal, Arie Ardiyanti Suryani

Abstract


This paper discusses the construction of a machine translation system between a local language and the Indonesian language. A local language was chosen because machine translators for local languages are still rare, and this is especially true for the Dayak language. The system in this research uses a statistical approach. The source data were taken from articles on dayaknews.com, giving a parallel corpus of approximately 1000 Dayak-Indonesian sentence pairs. This corpus of 1000 sentences was divided into three sections so that the patterns that emerged could be analyzed. The monolingual corpus consists of approximately 1000 Indonesian sentences. Testing was carried out with the Bilingual Evaluation Understudy (BLEU) metric and gave a highest accuracy value of 49.15%, an increase of approximately 3% over some other machine translators.
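As an illustration of the evaluation step, the following is a minimal sketch of a corpus-level BLEU computation, assuming whitespace tokenization, one reference per sentence, n-grams up to length 4 and no smoothing. The function names and the example sentences are invented for this sketch; it is not the tool or the data used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU (0-100), one reference per hypothesis, no smoothing."""
    matches = [0] * max_n   # clipped n-gram matches, per n-gram order
    totals = [0] * max_n    # n-grams produced by the system, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output shorter than the reference.
    bp = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_precision)


if __name__ == "__main__":
    # Invented Indonesian sentences, used only to exercise the function.
    system_output = ["saya pergi ke pasar hari ini"]
    reference = ["saya pergi ke pasar pagi ini"]
    print(round(corpus_bleu(system_output, reference), 2))
```

Scores from a sketch like this are only comparable with each other when tokenization, the number of references and the smoothing choice are kept fixed, which is worth keeping in mind when comparing a BLEU value against other machine translators.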

Keywords


Machine Translator; Statistical; Parallel Corpus; Monolingual Corpus; BLEU






DOI: http://dx.doi.org/10.30872/jim.v16i1.5315



Copyright (c) 2021 Informatika Mulawarman : Jurnal Ilmiah Ilmu Komputer



Informatika Mulawarman by http://e-journals.unmul.ac.id/index.php/JIM/index is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Under the CC BY-SA license, authors and other users are able to reprint, distribute or use the material for commercial purposes so long as they give attribution to the journal Informatika Mulawarman and license the republished material under the same license.