Friday, March 14, 2014

USE OF TECHNOLOGY TO IMPROVE SOCIAL COHESION



                       OUSL ANNUAL ACADEMIC SESSIONS – 2013

        L.A.M.P Nimalaratne[1], A.P Madurapperuma, K.A.R.D Gunaratne
Department of Electrical and Computer Engineering, Faculty of Engineering Technology,
The Open University of Sri Lanka

INTRODUCTION

Linguistic pluralism is an advantage for, and lays the foundation for, social cohesion, which is characterized by inclusivity, respect for all, and dialogue between all members of society [6]. It is, however, just one of several strategies required for social cohesion. Technology, and speech technology in particular, can be used to overcome the language barrier. In Sri Lanka, the Sinhalese make up about 75% of the population (according to the 2012 census). A majority of the remaining quarter of the population, who are either Muslims or Tamils, may not be able to read Sinhala but can understand spoken Sinhala. There are also a considerable number of people who are either visually impaired or totally blind (285 million people are visually impaired worldwide: 39 million are blind and 246 million have low vision) [5]. Such people find it difficult to benefit from the information sources widely available on the Internet. Western and other technologically advanced countries have tried to remedy this situation by building screen readers that read out a selected piece of text. However, there are no such portable systems capable of converting text written in Sinhala to speech through a web browser. This project is an attempt to build an open source text-to-speech (TTS) system that is widely available, platform independent and easy to use. The solution is implemented as a browser plug-in, making it usable by anybody who has access to a browser.



BACKGROUND

The conversion of text to speech, i.e. the automatic generation of speech output from computer-readable text, is called speech synthesis. TTS systems have been developed for many languages, with the majority of them working with English [2]. Sinhala is spoken by a very small proportion of the world's population, and there have been only a few attempts to build TTS systems specifically for it [1]. Those that have been implemented function as stand-alone applications that require the user to follow a complex installation procedure. In contrast, this is the first known documented work on a web-based Sinhala TTS application developed as a plug-in for the Mozilla Firefox browser. The application uses the Festival-based Sinhala TTS system [1][2][3]. The Festival Speech Synthesis System is an open source, stable and portable multilingual speech synthesis framework developed at the Centre for Speech Technology Research (CSTR) of the University of Edinburgh [3]. This framework is considered the most suitable for Sinhala by the local research community involved in language processing [2].
METHODOLOGY

Figure 1: Overall concept view of the system.
Figure 1 shows the overall system. The user selects the text to listen to in the browser and activates the extension. The extension converts the selected text to speech and plays the output on the computer's speakers or other sound-producing device. The core of the system is the Sinhala TTS engine, which synthesises speech for Sinhala. It takes the phone dictionary from the linguistic analyzer and the normalised Sinhala Unicode text from text analysis as input, and applies the letter-to-sound rules and letter-to-phone prediction to produce the desired phone sequences. Then, using the unit selection algorithm, appropriate sound segments for the phone sequences are chosen from the speech database. Finally, the waveform synthesis mechanism outputs the sound corresponding to the words. Designing the speech database is one of the major parts of the system; a prerecorded female voice is used for the speech output.
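The stages described above (text normalisation, letter-to-sound conversion, unit selection and waveform synthesis) can be illustrated with a toy sketch. It is written in Python for brevity and is not the Festival implementation: the character tables, the inherent-vowel simplification, the speech database contents and all function names are assumptions made for illustration, and the greedy pitch-matching step merely stands in for a real unit selection search.

```python
import unicodedata

# Hypothetical, deliberately tiny tables. A real system covers the full
# Sinhala alphabet and context-dependent pronunciation rules.
VOWELS = {"\u0d85": "a"}        # independent vowel (Sinhala letter ayanna)
CONSONANTS = {"\u0db8": "m"}    # consonant (Sinhala letter mayanna)
VOWEL_SIGNS = {"\u0dcf": "aa"}  # dependent vowel sign aela-pilla
AL_LAKUNA = "\u0dca"            # sign that suppresses the inherent vowel
INHERENT_VOWEL = "a"            # simplification: always "a"

# Toy speech database: phone -> candidate units as (pitch_hz, samples).
SPEECH_DB = {
    "a": [(200, [1, 2]), (230, [3, 4])],
    "m": [(225, [5]), (190, [6])],
    "aa": [(210, [7, 8, 9])],
}


def normalise(text):
    """Return NFC-normalised text with surrounding whitespace removed."""
    return unicodedata.normalize("NFC", text).strip()


def letters_to_phones(text):
    """Map characters to phones using the toy letter-to-sound tables."""
    phones = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in VOWELS:
            phones.append(VOWELS[ch])
        elif ch in CONSONANTS:
            phones.append(CONSONANTS[ch])
            if nxt == AL_LAKUNA:
                i += 1  # bare consonant, no vowel follows
            elif nxt in VOWEL_SIGNS:
                phones.append(VOWEL_SIGNS[nxt])
                i += 1
            else:
                phones.append(INHERENT_VOWEL)
        # Characters outside the toy tables are skipped.
        i += 1
    return phones


def select_units(phones, db):
    """Greedy stand-in for unit selection: keep pitch close to the previous unit."""
    chosen = []
    prev_pitch = None
    for phone in phones:
        candidates = db[phone]
        if prev_pitch is None:
            unit = candidates[0]
        else:
            unit = min(candidates, key=lambda u: abs(u[0] - prev_pitch))
        chosen.append(unit)
        prev_pitch = unit[0]
    return chosen


def synthesise(units):
    """Concatenate the sample lists of the chosen units into one waveform."""
    wave = []
    for _pitch, samples in units:
        wave.extend(samples)
    return wave


def text_to_wave(text):
    """Run the whole toy pipeline on a piece of text."""
    return synthesise(select_units(letters_to_phones(normalise(text)), SPEECH_DB))
```

For the word written as ayanna, mayanna, al-lakuna, mayanna, aela-pilla, `letters_to_phones` yields `["a", "m", "m", "aa"]`, and `text_to_wave` concatenates one stored unit per phone. A production system would replace the greedy step with a search over target and join costs.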



DESCRIPTION OF TECHNOLOGIES USED

Figure 2: Firefox Speech Extension Data Flow Diagram

We describe the implementation and evaluation of the Sinhala TTS browser extension based on the Festival framework. We have used two core components of the Java Speech API, i.e., speech recognition and speech synthesis [3], as the foundation. We found the Java platform to be the best option for designing a speech application given its portability, platform independence and support from all major web browsers. Other technologies used in this project include LiveConnect [4], the speech synthesizer [1][3], Chrome [4], and XPCOM (Cross Platform Component Object Model) [4]. The Chrome component of the browser extension adds features to the browser; it contains the XUL files, which define the extension's user interface. The XUL file in the extension is merged with the XUL file in the browser, so the extension adds functionality to the browser. XPCOM is a framework that allows different pieces of software to be developed independently. It helps integrate the JavaScript and Java components of the software. LiveConnect is a feature of web browsers that allows Java and JavaScript software to intercommunicate within a web page. From the Java side, it allows an applet to invoke the embedded scripts of a page or to access the built-in JavaScript environment. Conversely, from the JavaScript side, it allows a script to invoke applet methods or to access the Java runtime libraries [4].
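The data flow from selected text to audio can be sketched as follows. This is only an illustration in Python, not the extension's Java and JavaScript code: it assumes Festival's `text2wave` script is installed, that the voice accepts UTF-8 input, and it uses a placeholder voice name (`voice_sinhala_example`) that does not refer to any real installed voice. The function names are likewise hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Placeholder only: substitute the name of the voice actually installed.
HYPOTHETICAL_VOICE = "voice_sinhala_example"


def contains_sinhala(text):
    """True if any character lies in the Unicode Sinhala block (U+0D80 to U+0DFF)."""
    return any("\u0d80" <= ch <= "\u0dff" for ch in text)


def build_text2wave_command(text_file, wav_file, voice=HYPOTHETICAL_VOICE):
    """Build the text2wave command line: select a voice, then write a WAV file."""
    return ["text2wave", "-eval", f"({voice})", "-o", str(wav_file), str(text_file)]


def speak_selection(selected_text, out_dir):
    """Synthesise the selection to a WAV file, or return None if that is not possible."""
    if not contains_sinhala(selected_text):
        return None  # nothing for a Sinhala voice to read
    if shutil.which("text2wave") is None:
        return None  # Festival is not installed on this machine
    out_dir = Path(out_dir)
    text_file = out_dir / "selection.txt"
    wav_file = out_dir / "selection.wav"
    # Assumption: the voice expects UTF-8 encoded input.
    text_file.write_text(selected_text, encoding="utf-8")
    subprocess.run(build_text2wave_command(text_file, wav_file), check=True)
    return wav_file
```

In the extension itself, the selected text would reach the synthesizer through the JavaScript and Java bridge described above rather than through a temporary file; the sketch only shows the order of the steps.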
          


DISCUSSION AND FUTURE WORK
This system is a browser plug-in that can be installed on any browser that supports Java. It is currently operated with the mouse and is implemented for Firefox. The application can be extended to other browsers, including mobile browsers such as Opera Mini, the Android Browser and Dolphin Browser. The project can also be carried further by generalizing the tool so that it responds to voice commands, thus maximizing the user's interaction with the computer.


REFERENCES

1. Prof. Gihan Dias, Dr. Sanath Jayasena. (2010). Computer Text to Speech System with Phrase Breaking Algorithm (AMoRA). University of Moratuwa, Department of Computer Science and Engineering.
2. Dr. Ruwan Weerasinghe, Asanka Wasala, Viraj Welgama, Kumudu Gamage. (2007). Festival-si: A Sinhala Text to Speech System. IEEE, Language Technology Research Laboratory, University of Colombo School of Computing.
3. Paul Taylor, Alan W. Black, Richard Caley. (1998). The Architecture of the Festival Speech Synthesis System. University of Edinburgh, Centre for Speech Technology Research.
4. Mozilla Developer Network and individual contributors (MDN). (2005-2013).
5. World Health Organization. (2012). Visual impairment and blindness.
6. Sarifa Moola. Language pluralism and social cohesion. The Shared Societies Project.









[1] All correspondence should be addressed to L.A.M.P Nimalaratne, Dept. of Electrical and Computer Engineering, Faculty of Engineering Technology, The Open University of Sri Lanka (email: lampnima@gmail.com)