Please use this identifier to cite or link to this item: http://repository.kln.ac.lk/handle/123456789/8567
Title: Designing an Automatic Speech Recognition System to recognize frequently used sentences in Sinhala
Authors: Samankula, W.G.D.M.
Dias, N.G.J.
Keywords: Visually Impaired, Speaker Independent, Speech Recognition System
Issue Date: 2013
Publisher: University of Kelaniya
Citation: Samankula, W.G.D.M. and Dias, N.G.J., 2013. Designing an Automatic Speech Recognition System to recognize frequently used sentences in Sinhala. Proceedings of the Annual Research Symposium 2013, Faculty of Graduate Studies, University of Kelaniya, p. 62.
Abstract: There are millions of people with visual impairments, as well as motor impairments caused by old age, illness or accidents. These people face many challenges in their day-to-day lives. Even at home, a simple task such as controlling the radio, refrigerator or fan becomes difficult, because they must move with a white cane or a wheelchair, or get assistance from others. The aim of this research is to develop a speaker-independent continuous speech recognition system capable of understanding human speech in the Sinhala language rather than a foreign language, because the majority of people in Sri Lanka speak Sinhala. To achieve this goal, human speech signals have to be recognized and converted into effective commands to operate equipment. The Hidden Markov Model Toolkit (HTK), which is based on the Hidden Markov Model (HMM), a statistical approach, is used to develop the system; HTK is used for the data preparation, training, testing and analysis phases of the recognition process. Twenty-five Sinhala sentences of 2, 3 or 4 words, frequently used in day-to-day activities at home, were prepared. Recording was carried out with 10 native speakers (5 female and 5 male) in a quiet environment. Eight hundred speech samples were collected for training from 4 males and 4 females, each speaking every sentence 4 times. The experimental results show 94.00% sentence-level accuracy and 97.85% word-level accuracy with a monophone-based acoustic model, and 99.00% sentence-level accuracy and 99.69% word-level accuracy with a triphone-based acoustic model.
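Note: The sentence-level and word-level accuracies quoted above are reported in HTK's standard scoring terms. As a minimal illustrative sketch (not taken from the paper), assuming HResults-style scoring, sentence accuracy is the fraction of test sentences recognized exactly, and word accuracy is (N - D - S - I) / N, where substitutions, deletions and insertions come from a minimum-edit alignment of each recognized word sequence against its reference. The Python below shows that calculation; the sample transcriptions are placeholders, not the study's data.

from typing import List, Tuple


def align_counts(ref: List[str], hyp: List[str]) -> Tuple[int, int, int]:
    """Return (substitutions, deletions, insertions) from a minimum-edit
    alignment of hypothesis words against reference words."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = minimal edit cost aligning ref[:i] with hyp[:j]
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i          # all deletions
    for j in range(1, cols):
        cost[0][j] = j          # all insertions
    for i in range(1, rows):
        for j in range(1, cols):
            sub = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back through the cost table to count each error type.
    i, j, S, D, I = len(ref), len(hyp), 0, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            S += ref[i - 1] != hyp[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            D += 1
            i -= 1
        else:
            I += 1
            j -= 1
    return S, D, I


def score(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """pairs: (reference sentence, recognized sentence).
    Returns (sentence-level accuracy %, word-level accuracy %)."""
    total_words = total_S = total_D = total_I = correct_sent = 0
    for ref_s, hyp_s in pairs:
        ref, hyp = ref_s.split(), hyp_s.split()
        S, D, I = align_counts(ref, hyp)
        total_words += len(ref)
        total_S += S
        total_D += D
        total_I += I
        correct_sent += (ref == hyp)
    sent_acc = 100.0 * correct_sent / len(pairs)
    word_acc = 100.0 * (total_words - total_S - total_D - total_I) / total_words
    return sent_acc, word_acc


if __name__ == "__main__":
    # Placeholder Romanised command pairs, only to show the calculation.
    test_pairs = [
        ("radio eka daanna", "radio eka daanna"),
        ("fan eka nawaththanna", "fan eka daanna"),
    ]
    print(score(test_pairs))   # -> (50.0, 83.33...)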
URI: http://repository.kln.ac.lk/handle/123456789/8567
Appears in Collections:ARS - 2013

Files in This Item:
File: Samankula, W.G.D.M..pdf
Size: 244.52 kB
Format: Adobe PDF

