Digital Repository

The Sinhala Collation Sequence and its Representation in UNICODE

dc.contributor.author Weerasinghe, A.R. en_US
dc.contributor.author Herath, D.L. en_US
dc.contributor.author Gamage, K.N. en_US
dc.date.accessioned 2014-11-13T08:52:51Z
dc.date.available 2014-11-13T08:52:51Z
dc.date.issued 2005
dc.description.abstract The alphabet of a language is perhaps the first thing we learn as its users, and the alphabet of our mother tongue is the first alphabet we ever learn. Yet a closer look reveals that much about such an alphabet has never been explicitly specified anywhere. The Sinhala alphabetical order is a prime example: we use it and recite it, yet would be hard pressed to define it explicitly. Sinhala is spoken by approximately 20 million people in all parts of Sri Lanka except some districts in the north, east and centre. It is spoken by an additional 30,000 people (1993 figure) in Canada, Maldives, Singapore, Thailand and the United Arab Emirates. Sinhala is classified as an Indo-European language and is used as an official language. The UNICODE Collation Algorithm (UCA) is an attempt to make explicit the collation sequence of any language expressed in the UNICODE (or any other) coding system. In order to express the Sinhala collation sequence (alphabetical order) using UCA, the authors undertook the task of identifying the unresolved issues that stand in the way of an unambiguous definition of the order. This paper first describes the issues identified through this study, suggests alternative solutions for each and recommends one of them. Finally, it sets out the recommended collation sequence for Sinhala in the form of a UNICODE collation specification. The outcome of this process is a unique and unambiguous expression of the Sinhala collation sequence that can be tested using existing tools and software environments. en_US
dc.publisher Localisation Focus: The International Journal for Localisation en_US
dc.subject Sinhalese language, Sri Lanka
dc.subject Algorithms
dc.subject Alphabet
dc.subject Sequence (Linguistics)
dc.title The Sinhala Collation Sequence and its Representation in UNICODE
dc.type Article en_US
dc.identifier.department Linguistics en_US

