Sparse coding for speech recognition

dc.contributor.advisor	Barnard, E.	en
dc.contributor.email	willie.smit@gmail.com	en
dc.contributor.postgraduate	Smit, Willem Jacobus	en
dc.date.accessioned	2013-09-07T15:36:20Z
dc.date.available	2008-12-11	en
dc.date.available	2013-09-07T15:36:20Z
dc.date.created	2008-09-02	en
dc.date.issued	2008-12-11	en
dc.date.submitted	2008-11-11	en
dc.description	Thesis (PhD)--University of Pretoria, 2008.	en
dc.description.abstract	The brain is a complex organ that is computationally powerful. Recent research in the field of neurobiology helps scientists to better understand the working of the brain, especially how the brain represents or codes external signals. The research shows that the neural code is sparse. A sparse code is a code in which few neurons participate in the representation of a signal. Neurons communicate with each other by sending pulses or spikes at certain times. The spikes sent between several neurons over time are called a spike train. A spike train contains all the important information about the signal that it codes. This thesis shows how sparse coding can be used to do speech recognition. The recognition process consists of three parts. First, the speech signal is transformed into a spectrogram. Thereafter, a sparse code to represent the spectrogram is found. The spectrogram serves as the input to a linear generative model. The output of the model is a sparse code that can be interpreted as a spike train. Lastly, a spike train model recognises the words that are encoded in the spike train. The algorithms that search for sparse codes to represent signals require many computations. We therefore propose an algorithm that is more efficient than current algorithms. The algorithm makes it possible to find sparse codes in reasonable time if the spectrogram is fairly coarse. The system achieves a word error rate of 19% with a coarse spectrogram, while a system based on Hidden Markov Models achieves a word error rate of 15% on the same spectrograms.	en
dc.description.availability	unrestricted	en
dc.description.department	Electrical, Electronic and Computer Engineering	en
dc.identifier.citation	a 2008	en
dc.identifier.other	D535/gm	en
dc.identifier.upetdurl	http://upetd.up.ac.za/thesis/available/etd-11112008-151309/	en
dc.identifier.uri	http://hdl.handle.net/2263/29409
dc.language.iso		en
dc.publisher	University of Pretoria	en_ZA
dc.rights	© University of Pretoria 2008 D535/	en
dc.subject	Mathematical optimization	en
dc.subject	Spike train classification	en
dc.subject	Spike train	en
dc.subject	Speech recognition	en
dc.subject	Sparse code	en
dc.subject	Linear generative model	en
dc.subject	Sparse code measurement	en
dc.subject	Dictionary training	en
dc.subject	Overcomplete dictionary	en
dc.subject	Spectrogram	en
dc.subject	UCTD	en_US
dc.title	Sparse coding for speech recognition	en
dc.type	Thesis	en