TY - GEN
T1 - LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages
AU - Coleman, Jared
AU - Krishnamachari, Bhaskar
AU - Iskarous, Khalil
AU - Rosales, Ruben
N1 - Publisher Copyright: © 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - We propose a new paradigm for machine translation that is particularly useful for no-resource languages (those without any publicly available bilingual or monolingual corpora): LLM-RBMT (LLM-Assisted Rule Based Machine Translation). Using the LLM-RBMT paradigm, we design the first language education/revitalization-oriented machine translator for Owens Valley Paiute (OVP), a critically endangered Indigenous American language for which there is virtually no publicly available data. We present a detailed evaluation of the translator's components: a rule-based sentence builder, an OVP to English translator, and an English to OVP translator. We also discuss the potential of the paradigm, its limitations, and the many avenues for future research that it opens up.
KW - cs.CL
UR - https://www.scopus.com/pages/publications/85216924636
U2 - 10.18653/v1/2024.americasnlp-1.9
DO - 10.18653/v1/2024.americasnlp-1.9
M3 - Conference contribution
T3 - AmericasNLP 2024 - 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas - Proceedings of the Workshop
SP - 67
EP - 87
BT - AmericasNLP 2024 - 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas - Proceedings of the Workshop
A2 - Mager, Manuel
A2 - Ebrahimi, Abteen
A2 - Rijhwani, Shruti
A2 - Oncevay, Arturo
A2 - Chiruzzo, Luis
A2 - Pugh, Robert
A2 - von der Wense, Katharina
PB - Association for Computational Linguistics (ACL)
T2 - 4th Workshop on Natural Language Processing for Indigenous Languages of the Americas, AmericasNLP 2024
Y2 - 21 June 2024
ER -