Intonation modelling for the Nguni languages

Files

dissertation.pdf (614.01 KB)

Date

2006

Authors

Govender, Natasha

Publisher

University of Pretoria

Abstract

Although the complexity of prosody is widely recognised, there is a lack of widely accepted descriptive standards for prosodic phenomena. This situation has become particularly noticeable with the development of increasingly capable text-to-speech (TTS) systems. Such systems require detailed prosodic models to sound natural. For the languages of Southern Africa, the deficiencies in our modelling capabilities are acute. Little work of a quantitative nature has been published for the languages of the Nguni family (such as isiZulu and isiXhosa), and there are significant contradictions and imprecisions in the literature on this topic. We have therefore embarked on a programme aimed at understanding the relationship between linguistic and physical variables of a prosodic nature in this family of languages. We then use the knowledge gathered to build intonation models for isiZulu and isiXhosa as representatives of the Nguni languages. Firstly, we need to extract physical measurements from voice recordings of the Nguni family of languages. A number of pitch tracking algorithms have been developed; however, to our knowledge, these algorithms have not been evaluated formally on a Nguni language. In order to decide on an appropriate algorithm for further analysis, evaluations have been performed on two state-of-the-art algorithms, namely the Praat pitch tracker and Yin (developed by Alain de Cheveigné). Praat’s pitch tracker algorithm performs somewhat better than Yin in terms of gross and fine errors, and we use this algorithm for the rest of our analysis.
For South African languages, the task of building an intonation model is complicated by the lack of available intonation resources. We describe the methodology used for developing a general-purpose intonation corpus and the various methods implemented to extract relevant features such as fundamental frequency, intensity and duration from the spoken utterances of these languages.
In order to understand how the ‘expected’ intonation relates to the actual measured characteristics extracted, we developed two different statistical approaches to build intonation models for isiZulu and isiXhosa. The first is based on straightforward statistical techniques and the second uses a classifier. Both intonation models built produce fairly good accuracy for our isiZulu and isiXhosa sets of data. The neural network classifier used produces slightly better results for both sets of data than the statistical method. The classification model is also more robust and can easily learn from the training data. We show that it is possible to build fairly good intonation models for these languages using different approaches, and that intensity and fundamental frequency are comparable in predictive value for the ascribed tone.
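
The abstract compares pitch trackers in terms of gross and fine errors without defining them here. The following is a minimal sketch of one common way such errors are computed, not the dissertation's own procedure: the function name `pitch_errors`, the 20% gross-error threshold, and the convention that a value of 0 marks an unvoiced frame are all assumptions made for illustration.

```python
from statistics import mean


def pitch_errors(reference_hz, estimated_hz, gross_threshold=0.2):
    """Return (gross_error_rate, mean_fine_error_hz) for two F0 tracks.

    Assumptions (not taken from the dissertation):
    - both tracks are frame-aligned lists of F0 values in Hz;
    - a value of 0 marks an unvoiced frame;
    - a frame is a gross error when the estimate deviates from the
      reference by more than `gross_threshold` (20% by default).
    """
    # Compare only frames that both tracks consider voiced.
    pairs = [(r, e) for r, e in zip(reference_hz, estimated_hz) if r > 0 and e > 0]
    if not pairs:
        raise ValueError("no frames voiced in both tracks")

    # Flag frames whose relative deviation exceeds the threshold.
    gross = [abs(e - r) / r > gross_threshold for r, e in pairs]

    # Fine error: mean absolute deviation over the remaining frames.
    fine = [abs(e - r) for (r, e), is_gross in zip(pairs, gross) if not is_gross]

    gross_rate = sum(gross) / len(pairs)
    fine_error = mean(fine) if fine else 0.0
    return gross_rate, fine_error


if __name__ == "__main__":
    reference = [100, 100, 0, 200]   # hand-corrected F0 (hypothetical values)
    estimated = [102, 200, 110, 198]  # tracker output (hypothetical values)
    print(pitch_errors(reference, estimated))
```

Under these assumptions, a tracker with a lower gross error rate makes fewer octave-type mistakes, while a lower fine error indicates closer agreement on the frames it gets approximately right.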

Description

Dissertation (MSc (Computer Science))--University of Pretoria, 2006.

Keywords

Intonation corpus, Intensity, Intonation modelling, Pitch tracking, Autocorrelation, Classification, Tone, Fundamental frequency, Prosody, Nguni languages, UCTD

Citation

Govender, N 2006, Intonation modelling for the Nguni languages, MSc Dissertation, University of Pretoria, Pretoria, viewed yymmdd <http://hdl.handle.net/2263/28847>

URI

http://hdl.handle.net/2263/28847

Collections

Theses and Dissertations (University of Pretoria)
Theses and Dissertations (Computer Science)
