Cross-language acoustic adaptation for automatic speech recognition

dc.contributor.advisor Botha, Elizabeth C. en
dc.contributor.postgraduate Nieuwoudt, Christoph en
dc.date.accessioned 2013-09-07T09:38:51Z
dc.date.available 2005-01-06 en
dc.date.available 2013-09-07T09:38:51Z
dc.date.created 2000-04-08 en
dc.date.issued 2006-01-06 en
dc.date.submitted 2005-01-06 en
dc.description Thesis (PhD (Electrical, Electronic and Computer Engineering))--University of Pretoria, 2006. en
dc.description.abstract Speech recognition systems have been developed for the major languages of the world, yet for the majority of languages there are currently no large vocabulary continuous speech recognition (LVCSR) systems. The development of an LVCSR system for a new language is very costly, mainly because a large speech database has to be compiled to robustly capture the acoustic characteristics of the new language. This thesis investigates techniques that enable the re-use of acoustic information from a source language, in which a large amount of data is available, in implementing a system for a new target language. The assumption is that too little data is available in the target language to train a robust speech recognition system on that data alone, and that use of acoustic information from a source language can improve the performance of a target language recognition system. Strategies for cross-language use of acoustic information are proposed, including training on pooled source and target language data, adaptation of source language models using target language data, adapting multilingual models using target language data and transforming source language data to augment target language data for model training. These strategies are allied with Bayesian and transformation-based techniques, usually used for speaker adaptation, as well as with discriminative learning techniques, to present a framework for cross-language re-use of acoustic information. Extensions to current adaptation techniques are proposed to improve the performance of these techniques specifically for cross-language adaptation. A new technique for transformation-based adaptation of variance parameters and a cost-based extension of the minimum classification error (MCE) approach are proposed. Experiments are performed for a large number of approaches from the proposed framework for cross-language re-use of acoustic information. 
Relatively large amounts of English speech data are used in conjunction with smaller amounts of Afrikaans speech data to improve the performance of an Afrikaans speech recogniser. Results indicate that a significant reduction in word error rate (between 26% and 50%, depending on the amount of Afrikaans data available) is possible when English acoustic data is used in addition to Afrikaans speech data from the same database (i.e. both sets of data were recorded under the same conditions and the same labelling process was used). For same-database experiments, best results are achieved for approaches that train models on pooled source and target language data and then perform further adaptation of the models using Bayesian or discriminative techniques on target language data only. Experiments are also performed to evaluate the use of English data from a different database than the Afrikaans data. Peak reductions in word error rate of between 16% and 35% are delivered, depending on the amount of Afrikaans data available. Best results are achieved for an approach that performs a simple transformation of source model parameters using target language data, and then performs Bayesian adaptation of the transformed model on target language data. en
dc.description.availability unrestricted en
dc.description.department Electrical, Electronic and Computer Engineering en
dc.identifier.citation Nieuwoudt, C 2000, Cross-language acoustic adaptation for automatic speech recognition, PhD thesis, University of Pretoria, Pretoria, viewed yymmdd < http://hdl.handle.net/2263/26974 > en
dc.identifier.upetdurl http://upetd.up.ac.za/thesis/available/etd-01062005-071829/ en
dc.identifier.uri http://hdl.handle.net/2263/26974
dc.language.iso en
dc.publisher University of Pretoria en_ZA
dc.rights © 2000, University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. en
dc.subject Minimum classification error adaptation en
dc.subject Transformation-based adaptation en
dc.subject Bayesian adaptation en
dc.subject Cross-language acoustic adaptation en
dc.subject Multilingual speech recognition en
dc.subject UCTD en_US
dc.title Cross-language acoustic adaptation for automatic speech recognition en
dc.type Thesis en
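
Illustrative note (not part of the thesis record): the abstract refers to Bayesian adaptation of source language models using target language data. As a rough sketch of that general idea only, the snippet below applies the common textbook maximum a posteriori (MAP) update to a single Gaussian mean, with the source-language mean acting as the prior. The function name `map_adapt_mean`, the prior weight `tau`, and the per-frame `posteriors` are assumptions introduced here for illustration; the record does not state which formulation or parameter values the thesis uses.

```python
import numpy as np


def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative sketch).

    prior_mean : (d,) mean taken from the source-language model (the prior)
    frames     : (n, d) target-language feature vectors
    posteriors : (n,) occupation probability of this Gaussian for each frame
    tau        : prior weight (hypothetical value; larger trusts the prior more)
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    frames = np.asarray(frames, dtype=float)
    posteriors = np.asarray(posteriors, dtype=float)

    occupancy = posteriors.sum()        # soft count of target frames
    weighted_sum = posteriors @ frames  # posterior-weighted sum of frames, shape (d,)

    # Interpolate between the prior mean and the target-data average,
    # weighted by tau and by the amount of target data observed.
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)
```

With no target data the function returns the source-language mean unchanged, and as the soft count of target frames grows the estimate moves toward the target-language average, which matches the abstract's premise that the source model matters most when target data is scarce.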

