Improving transliteration system from Nôm scripts into Vietnamese national scripts using language model
This paper proposes a method to improve Nôm Converter, a transliteration system currently in use. By building language models for specific literary forms and domains, and by using a larger corpus, we obtain a transliteration system that outperforms Nôm Converter. On two test sets, our system achieves BLEU scores of 82.80 and 89.72, whereas Nôm Converter scores 56.84 and 50.95 on the same sets.
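
To illustrate how a language model can help choose among competing readings of a Nôm character, the following is a minimal sketch and not the system described in this paper. It assumes a bigram model with add-one smoothing and a Viterbi search over candidate readings. The corpus, the candidate table `readings`, and the character keys `c1`, `c2`, `c3` are hypothetical placeholders standing in for real Nôm characters; all function names are likewise illustrative.

```python
import math
from collections import defaultdict


def train_bigram(sentences):
    """Count context unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams = defaultdict(int)
    bigrams = defaultdict(int)
    for sent in sentences:
        tokens = ["<s>"] + sent.split()
        for prev, cur in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    return unigrams, bigrams


def log_prob(prev, cur, unigrams, bigrams, vocab_size):
    """Add-one smoothed log P(cur | prev)."""
    num = bigrams.get((prev, cur), 0) + 1
    den = unigrams.get(prev, 0) + vocab_size
    return math.log(num / den)


def transliterate(chars, readings, unigrams, bigrams, vocab_size):
    """Viterbi search: pick the reading sequence with the highest bigram score."""
    best = {"<s>": (0.0, [])}  # last syllable -> (log score, path so far)
    for ch in chars:
        nxt = {}
        for cand in readings[ch]:
            score, path = max(
                (
                    (s + log_prob(prev, cand, unigrams, bigrams, vocab_size), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda t: t[0],
            )
            nxt[cand] = (score, path + [cand])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


# Toy, hypothetical data: a two-sentence corpus in the national script and
# a candidate table mapping placeholder character keys to possible readings.
corpus = ["trăm năm trong cõi", "năm nay trong nhà"]
readings = {
    "c1": ["trăm"],
    "c2": ["năm", "nam"],
    "c3": ["trong", "trông"],
}

unigrams, bigrams = train_bigram(corpus)
vocab_size = len({tok for sent in corpus for tok in sent.split()})
result = transliterate(["c1", "c2", "c3"], readings, unigrams, bigrams, vocab_size)
print(" ".join(result))
```

Under this sketch, a model trained on text from one literary form or domain would favor the readings typical of that form, which is the intuition behind building separate language models per form and domain.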