Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis
Date
2006
Publisher
Proceedings of the COLING/ACL Main Conference Poster Sessions, Association for Computational Linguistics
Abstract
This paper describes an architecture for converting Sinhala Unicode text
into a phonemic specification of pronunciation. The study focuses mainly on
disambiguating schwa /ə/ and /a/ vowel epenthesis for consonants, one of the
significant problems found in Sinhala. The problem is addressed by
formulating a set of rules. The proposed rules were tested on 30,000
distinct words obtained from a corpus, and the output was compared with
phonemic transcriptions of the same words produced manually by an expert.
The Grapheme-to-Phoneme (G2P) conversion model achieves 98% accuracy.
Keywords
Sinhalese language, Sri Lanka, Graphemics, Transmutation (Linguistics)