Repository logo
Communities & Collections
All of DSpace
  • English
  • العربية
  • বাংলা
  • Català
  • Čeština
  • Deutsch
  • Ελληνικά
  • Español
  • Suomi
  • Français
  • Gàidhlig
  • हिंदी
  • Magyar
  • Italiano
  • Қазақ
  • Latviešu
  • Nederlands
  • Polski
  • Português
  • Português do Brasil
  • Srpski (lat)
  • Српски
  • Svenska
  • Türkçe
  • Yкраї́нська
  • Tiếng Việt
Log In
New user? Click here to register.Have you forgotten your password?
  1. Home
  2. Browse by Author

Browsing by Author "Weerasinghe, A.R."

Filter results by typing the first few letters
Now showing 1 - 4 of 4
  • Results Per Page
  • Sort Options
  • No Thumbnail Available
    Item
    A Rule Based Syllabification Algorithm for Sinhala
    (Lecture Notes in Computer Science/ IJCNLP: International Joint Conference on Natural Language Processing, 2005) Weerasinghe, A.R.; Wasala, R.A.; Gamage, K.N.
    This paper presents a study of Sinhala syllable structure and an algorithm for identifying syllables in Sinhala words. After a thorough study of the Syllable structure and linguistic rules for syllabification of Sinhala words and a survey of the relevant literature, a set of rules was identified and implemented as a simple, easy-to-implement algorithm. The algorithm was tested using 30,000 distinct words obtained from a corpus and compared with the same words manually syllabified. The algorithm performs with 99.95 % accuracy.
  • No Thumbnail Available
    Item
    Festival-si: A Sinhala Text-to-Speech System
    (Lecture Notes in Computer Science/ IJCNLP: International Joint Conference on Natural Language Processing, 2007) Weerasinghe, A.R.; Wasala, R.A.; Gamage, K.N.
    This paper brings together the development of the first Text-to-Speech (TTS) system for Sinhala using the Festival framework and practical applications of it. Construction of a diphone database and implementation of the natural language processing modules are described. The paper also presents the development methodology of direct Sinhala Unicode text input by rewriting Letter-to-Sound rules in Festival's context sensitive rule format and the implementation of Sinhala syllabification algorithm. A Modified Rhyme Test (MRT) was conducted to evaluate the intelligibility of the synthesized speech and yielded a score of 71.5% for the TTS system described.
  • No Thumbnail Available
    Item
    Sinhala Grapheme-to-Phoneme Conversion and Rules for Schwa Epenthesis
    (Proceedings of the COLING/ACL Main Conference Poster Sessions, Association for Computational Linguistics, 2006) Weerasinghe, A.R.; Wasala, R.A.; Gamage, K.N.
    This paper describes an architecture to convert Sinhala Unicode text into phonemic specification of pronunciation. The study was mainly focused on disambiguating schwa-/\/ and /a/ vowel epenthesis for consonants, which is one of the significant problems found in Sinhala. This problem has been addressed by formulating a set of rules. The proposed set of rules was tested using 30,000 distinct words obtained from a corpus and com-pared with the same words manually transcribed to phonemes by an expert. The Grapheme-to-Phoneme (G2P) con-version model achieves 98 % accuracy.
  • No Thumbnail Available
    Item
    The Sinhala Collation Sequence and its Representation in UNICODE
    (Localisation Focus: The International Journal for Localisation, 2005) Weerasinghe, A.R.; Herath, D.L.; Gamage, K.N.
    The alphabet of a language is perhaps the first thing we learn as users. The alphabet of our mother tongue would be the first alphabet we ever learn. And yet, a closer look reveals that there is much about such an alphabet that we have not explicitly specified anywhere. The Sinhala alphabet order is a prime example. We use it, recite it and yet would be hard pressed to define it explicitly. Sinhala is spoken in all parts of Sri Lanka except some districts in the north, east and centre by approximately 20 million people. It is spoken by an additional 30,000 (1993) people in Canada, Maldives, Singapore, Thailand and United Arab Emirates. Sinhala is classified as an Indo-European language and used as an official language. The UNICODE Collation Algorithm (UCA) is an attempt to make explicit the collation sequence of any language expressed in the UNICODE (or any other) coding system. In order to express the Sinhala collation sequence (alphabetical order) using UCA, the authors undertook the task of identifying unresolved issues facing the unambiguous definition of the order. This paper first describes the issues identified through this study, suggesting alternate solutions and recommending one of them. Finally, it sets out the recommended collation sequence for Sinhala in the form of the UNICODE collation specification. The outcome of this process is a unique and unambiguous expression of the Sinhala collation sequence which could be tested using existing tools and software environments.

DSpace software copyright © 2002-2025 LYRASIS

  • Privacy policy
  • End User Agreement
  • Send Feedback
Repository logo COAR Notify