dc.contributor.author |
Zietsman, Grant
|
|
dc.contributor.author |
Malekian, Reza
|
|
dc.date.accessioned |
2023-02-22T13:50:01Z |
|
dc.date.available |
2023-02-22T13:50:01Z |
|
dc.date.issued |
2022-12 |
|
dc.description.abstract |
Accent invariance in speech recognition is a challenging problem, especially in the area of aviation. In this paper, a speech recognition system is developed to transcribe accented speech between pilots and air traffic controllers. The system handles accents in continuous speech by modelling phonemes using Hidden Markov Models (HMMs) with Gaussian mixture model (GMM) probability density functions for each state. These phonemes are used to build word models of the NATO phonetic alphabet as well as the numerals 0 to 9, with transcriptions obtained from the Carnegie Mellon University (CMU) pronouncing dictionary. Mel-frequency cepstral coefficients (MFCCs) with delta and delta-delta coefficients are used for feature extraction. Amplitude normalisation and covariance scaling are implemented to improve recognition accuracy. A word error rate (WER) of 2% for seen speakers and 22% for unseen speakers is obtained. |
en_US |
dc.description.department |
Electrical, Electronic and Computer Engineering |
en_US |
dc.description.librarian |
hj2023 |
en_US |
dc.description.uri |
http://jit.ndhu.edu.tw |
en_US |
dc.identifier.citation |
Zietsman, G. & Malekian, R. 2022, 'Modelling of a speech-to-text recognition system for air traffic control and NATO air command', Journal of Internet Technology, vol. 23, no. 7, pp. 1527-1539, doi: 10.53106/160792642022122307008. |
en_US |
dc.identifier.issn |
1607-9264 (print) |
|
dc.identifier.issn |
2079-4029 (online) |
|
dc.identifier.other |
10.53106/160792642022122307008 |
|
dc.identifier.uri |
https://repository.up.ac.za/handle/2263/89771 |
|
dc.language.iso |
en |
en_US |
dc.publisher |
Taiwan Academic Network Management Committee |
en_US |
dc.rights |
Taiwan Academic Network Management Committee |
en_US |
dc.subject |
Automatic speech recognition (ASR) |
en_US |
dc.subject |
Hidden Markov model (HMM) |
en_US |
dc.subject |
Gaussian mixture model (GMM) |
en_US |
dc.subject |
Mel-frequency cepstral coefficients (MFCC) |
en_US |
dc.subject |
Covariance scaling |
en_US |
dc.title |
Modelling of a speech-to-text recognition system for air traffic control and NATO air command |
en_US |
dc.type |
Article |
en_US |