USE OF TECHNOLOGY TO IMPROVE
SOCIAL COHESION
OUSL
ANNUAL ACADEMIC SESSIONS – 2013
Department of Electrical and Computer Engineering, Faculty of
Engineering Technology,
The Open University of Sri Lanka
INTRODUCTION
Linguistic pluralism supports and lays the foundation for social cohesion,
which is characterized by inclusivity, respect for all, and dialogue between
all members of society [6]. It is, however, just one of several strategies
required for social cohesion. Speech technology can help overcome the language
barrier. In Sri Lanka, the Sinhalese make up about 75% of the population
(according to the 2012 census). A majority of the remaining quarter of the
population, who are either Muslims or Tamils, may not be able to read Sinhala
but are able to understand spoken Sinhala.
There are also a considerable number of people who are either visually
impaired or totally blind (285 million people are visually impaired worldwide:
39 million are blind and 246 million have low vision) [5]. Such people find it
difficult to benefit from the information sources widely available on the
Internet. Western countries and other more technologically advanced countries
have tried to remedy this situation by building screen readers that read
out a selected piece of text. However, there are no such portable systems
capable of converting text written in the Sinhalese language to voice through
the web browser. This project is an attempt to build an open source
text-to-speech (TTS) system that will be widely available, platform independent
and easy to use. The solution will be implemented as a browser plug-in, making
it usable by anybody who has access to a browser.
BACKGROUND
The conversion of text to speech, i.e. the automatic generation of speech
output from computer readable text, is called speech synthesis. TTS systems
have been developed for many languages, with the majority of them working with
the English language [2]. Sinhalese is a language used by a very limited
population of the world. There have been only a few attempts to build TTS
systems specifically for the Sinhalese language [1]. Those that are implemented
function as stand-alone applications that require the user to follow a complex
installation procedure. In contrast, this is the first known documented work on
a web-based Sinhala TTS application that has been developed as a plug-in for
the Mozilla Firefox Internet browser. This application uses the Festival
framework based Sinhala TTS system [1][2][3]. The Festival Speech Synthesis
System is an open source, stable and portable multilingual speech synthesis
framework developed at the Centre for Speech Technology Research (CSTR) of the
University of Edinburgh [3]. This framework is considered the most suitable for
the Sinhala language by the local research community involved in language
processing [2].
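Festival ships a command-line script, text2wave, that converts a text file
into a waveform file. As a minimal sketch, assuming Festival is installed and
on the PATH, the command line could be assembled as shown below. The helper
name build_text2wave_command and the voice name voice_sinhala_example are
hypothetical and are not taken from this work; the Python wrapper is only an
illustration and is not part of the Java-based extension described here.

```python
def build_text2wave_command(text_file, wav_file, voice=None):
    """Return the argument list for Festival's text2wave script.

    text_file: path to a file holding the text to be spoken.
    wav_file:  path where the synthesised waveform should be written.
    voice:     optional name of an installed Festival voice function
               (hypothetical here), evaluated before synthesis.
    """
    cmd = ["text2wave"]
    if voice is not None:
        # -eval takes a Scheme expression; calling the voice function selects it.
        cmd += ["-eval", f"({voice})"]
    # -o names the output file; the input text file comes last.
    cmd += ["-o", wav_file, text_file]
    return cmd


if __name__ == "__main__":
    # Assumed voice name, for illustration only.
    print(build_text2wave_command("input.txt", "output.wav",
                                  voice="voice_sinhala_example"))
```

On a machine with Festival and a suitable voice installed, the returned list
could be passed to subprocess.run(cmd, check=True) to produce the audio file.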
METHODOLOGY
Figure 1: Overall concept view of the system.
The diagram in Figure 1 shows the overall system. The user selects the text to
listen to from the browser and activates the extension. The extension converts
the selected text to speech and plays the output on the computer's speakers or
other sound producing device. The core of the system is the Sinhala TTS engine,
which generates synthesized speech for Sinhala. It takes the phone dictionary
from the linguistic analyzer and normalised Sinhala Unicode text from the text
analysis stage as input, and applies the letter-to-sound rules and
letter-to-phone prediction, which results in the desired phone sequences. Then,
using the unit selection algorithm, appropriate sound segments for the phone
sequences are obtained with the help of the speech database. Finally, the
waveform synthesis mechanism outputs the sound corresponding to the text. One
of the major parts of the system is the design of the speech database, for
which a prerecorded female voice is used for speech output.
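The stages described above (text normalisation, letter-to-phone conversion,
unit selection and waveform synthesis) can be sketched as a toy pipeline. This
is only an illustration under stated assumptions, not the Festival-based
implementation: the three-letter table, the tiny speech database, the use of
pitch difference as the join cost, and all function names are hypothetical. A
real Sinhala system needs far richer letter-to-sound rules.

```python
import unicodedata

# Toy letter-to-phone table (assumption): each consonant letter carries
# an inherent vowel. Real Sinhala letter-to-sound rules are more complex.
LETTER_TO_PHONES = {
    "ක": ["k", "a"],
    "ම": ["m", "a"],
    "ල": ["l", "a"],
}

# Toy speech database (assumption): phone -> candidate units,
# each unit being (pitch_hz, samples).
SPEECH_DB = {
    "k": [(200.0, [0.1, 0.2]), (220.0, [0.1, 0.3])],
    "a": [(210.0, [0.5, 0.4]), (180.0, [0.6, 0.5])],
    "m": [(205.0, [0.2, 0.2])],
    "l": [(215.0, [0.3, 0.1])],
}


def normalise(text):
    """Apply Unicode NFC normalisation and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def to_phones(text):
    """Map each known letter to its phones; unknown characters are skipped."""
    phones = []
    for ch in text:
        phones.extend(LETTER_TO_PHONES.get(ch, []))
    return phones


def select_units(phones):
    """Choose one unit per phone, minimising total pitch mismatch at joins."""
    if not phones:
        return []
    # best holds one (total_cost, path) entry per candidate of the current phone.
    best = [(0.0, [unit]) for unit in SPEECH_DB[phones[0]]]
    for phone in phones[1:]:
        extended = []
        for unit in SPEECH_DB[phone]:
            # Join cost: absolute pitch difference to the previous unit.
            cost, path = min(
                ((c + abs(p[-1][0] - unit[0]), p) for c, p in best),
                key=lambda entry: entry[0],
            )
            extended.append((cost, path + [unit]))
        best = extended
    return min(best, key=lambda entry: entry[0])[1]


def synthesise(text):
    """Run the whole toy pipeline and return the concatenated samples."""
    units = select_units(to_phones(normalise(text)))
    return [sample for _, samples in units for sample in samples]


if __name__ == "__main__":
    print(to_phones("කමල"))
    print(len(synthesise("කමල")), "samples")
```

The dynamic-programming search in select_units keeps, for every candidate of
the current phone, the cheapest path leading to it, which is the usual shape of
a unit selection search; a production system would add a target cost as well.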
DESCRIPTION OF TECHNOLOGIES USED
Figure 2: Firefox Speech Extension Data Flow Diagram
We describe the implementation and evaluation of the Sinhala TTS browser
extension based on the Festival framework. We have used two core components of
the Java Speech API, i.e., speech recognition and speech synthesis [3], as the
foundation. We found the Java platform to be the best option for designing a
speech application, given its portability, platform independence and the
support provided by all major web browsers. Other technologies used in this
project include LiveConnect [4], Speech Synthesizer [1][3], Chrome [4], and
XPCOM (Cross Platform Component Object Model) [4]. The Chrome component located
in the browser extension is able to add features to the browser. XUL files are
contained in the Chrome component and help to design the extension's user
interface. The XUL file in the extension is combined with the XUL file in the
browser; hence the extension adds additional functionality to the browser.
XPCOM is a framework that allows different pieces of software to be developed
independently. It helps integrate the JavaScript and Java components of the
software. LiveConnect is a feature of web browsers that allows Java and
JavaScript software to intercommunicate within a web page. From the Java side,
it allows an applet to invoke the embedded scripts of a page or to access the
built-in JavaScript environment. Conversely, from the JavaScript side, it
allows a script to invoke applet methods or to access the Java runtime
libraries [4].
DISCUSSION AND FUTURE WORK
This system is a browser plug-in that can be installed on any browser
supporting the Java language. It is attached to the browser as a plug-in and is
currently operated through the mouse interface. At present the application is
implemented in Firefox. It can be extended to other browsers, such as mobile
browsers (Opera Mini, Android Browser, Dolphin Browser and the like). The
project can also be carried further by generalizing the tool so that the
application is able to function on voice commands, thus maximizing the user's
interaction with the computer.
REFERENCES
1. Prof. Gihan Dias, Dr. Sanath Jayasena. (2010). Computer Text to Speech
System with Phrase Breaking Algorithm (AMoRA). University of Moratuwa,
Department of Computer Science and Engineering.
2. Dr. Ruwan Weerasinghe, Asanka Wasala, Viraj Welgama, Kumudu Gamage. (2007).
Festival-si: A Sinhala Text to Speech System. IEEE Language Technology Research
Laboratory, University of Colombo, School of Computing.
3. Paul Taylor, Alan W Black, Richard Caley. (1998). The Architecture of the
Festival Speech Synthesis System. University of Edinburgh, Centre for Speech
Technology Research.
4. Mozilla Developer Network and individual contributors (MDN). (2005-2013).
5. World Health Organization. (2012). Visual impairment and blindness.
6. Sarifa Moola. Language pluralism and social cohesion. The Shared Societies
Project.
[1] All correspondence should be addressed to L.A.M.P Nimalaratne, Dept. of
Electrical and Computer Engineering, Faculty of Engineering Technology, Open
University of Sri Lanka (email: lampnima@gmail.com)