Bridging Linguistic Gaps: A Review of AI-Driven Speech-to-Speech Translation for Sinhala and Tamil in Sri Lanka

dc.contributor.author: Dilshani, I.
dc.contributor.author: Chandrasena, M.
dc.date.accessioned: 2025-09-25T07:57:16Z
dc.date.issued: 2025
dc.description.abstract: Sri Lanka has two official languages, Sinhala and Tamil, making it linguistically diverse. Effective communication between Sinhala-speaking and Tamil-speaking communities increasingly depends on real-time speech-to-speech translation. Current Speech-to-Speech Translation (S2S) solutions are useful but face three major limitations: dependence on internet connectivity, degraded performance in noisy environments, and unnatural Text-to-Speech (TTS) output. Automatic Speech Recognition (ASR) systems such as CallTran, and Android-based solutions built on Google APIs and PocketSphinx, struggle to handle varied accents. Machine Translation (MT) systems, in turn, achieve poor semantic accuracy because parallel Sinhala-Tamil corpora are scarce. Cascading ASR, MT, and TTS also introduces latency and compounding misinterpretation errors that undermine real-time functionality. This review examines current models, highlights theoretical and practical gaps, compares existing approaches, and proposes directions for future research. Three essential gaps require attention: bilingual dataset annotation, offline functionality, and natural voice synthesis. We propose building large bilingual datasets, developing noise-robust ASR models using self-supervised approaches such as Whisper and Wav2Vec 2.0 (Wav2Vec2), and fine-tuning multilingual MT models such as the Multilingual Bidirectional and Auto-Regressive Transformer (mBART) for low-resource Sinhala-Tamil translation. Additionally, TTS models such as Tacotron, FastSpeech, and Coqui TTS should be optimized for prosody and intonation.
dc.identifier.citation: Dilshani, I., & Chandrasena, M. (2025). Bridging linguistic gaps: A review of AI-driven speech-to-speech translation for Sinhala and Tamil in Sri Lanka. In Proceedings of the International Research Conference on Smart Computing and Systems Engineering (SCSE 2025). Department of Industrial Management, Faculty of Science, University of Kelaniya.
dc.identifier.uri: http://repository.kln.ac.lk/handle/123456789/30046
dc.publisher: Department of Industrial Management, Faculty of Science, University of Kelaniya.
dc.subject: automatic speech recognition
dc.subject: low resource speech-to-speech translation
dc.subject: machine translation
dc.subject: Sinhala-Tamil translation
dc.subject: text-to-speech
dc.title: Bridging Linguistic Gaps: A Review of AI-Driven Speech-to-Speech Translation for Sinhala and Tamil in Sri Lanka
dc.type: Article
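The abstract describes the standard cascaded S2S architecture (ASR → MT → TTS) and notes that chaining the stages compounds latency and errors. A minimal sketch of that cascade is below; the stage implementations are injected as callables, so a Whisper ASR model, a fine-tuned mBART translator, and a Coqui TTS synthesizer could each fill a slot. The function and parameter names here are illustrative, not from the paper.

```python
from typing import Callable

# Hypothetical sketch of a cascaded speech-to-speech translation pipeline
# (ASR -> MT -> TTS), with each stage supplied as a plain callable.

def cascade_s2s(
    asr: Callable[[bytes], str],   # source-language audio -> source text
    mt: Callable[[str], str],      # source text -> target text
    tts: Callable[[str], bytes],   # target text -> target-language audio
) -> Callable[[bytes], bytes]:
    """Compose the three stages into one speech-in, speech-out function."""
    def translate(audio: bytes) -> bytes:
        text = asr(audio)            # 1. recognize speech (e.g. Sinhala)
        translated = mt(text)        # 2. translate text (e.g. to Tamil)
        return tts(translated)       # 3. synthesize target-language speech
    return translate
```

With Hugging Face `transformers`, the ASR slot could be filled by `pipeline("automatic-speech-recognition", model="openai/whisper-small")` and the MT slot by `facebook/mbart-large-50-many-to-many-mmt` (which lists Sinhala as `si_LK` and Tamil as `ta_IN` among its language codes); TTS for Sinhala and Tamil remains the weakest link, as the abstract notes. Because each stage only sees the previous stage's output, an ASR error propagates unchecked through MT and TTS, which is one source of the misinterpretation issues the review highlights.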

Files

Original bundle
Name: SCSE Abstract Proceedings 2025-33.pdf
Size: 10.56 KB
Format: Adobe Portable Document Format
