Speech recognition system using MATLAB : design, implementation, and samples codes

Bibliographic Details
Main Authors: Abushariah, Ahmad A. M., Gunawan, Teddy Surya
Format: Book
Language: English
Published: Lambert Academic Publishing 2011
Subjects:
Online Access: http://irep.iium.edu.my/27200/
http://irep.iium.edu.my/27200/1/Speech_Recognition.pdf
Description
Summary: Research in automatic speech recognition has been under way for almost four decades, and speech recognition applications developed over that time have made invaluable contributions. Speech has the potential to be a better interface than other input devices such as the keyboard or mouse. This project aims to develop an automated speech recognition system for English digits. It relies heavily on the Hidden Markov Model (HMM), a well-known and widely used statistical method for characterizing speech patterns that provides a highly reliable way of recognizing speech. The project discusses the theory of HMMs and then extends these ideas to development and implementation by applying the method to computational speech recognition. The system recognizes a spoken utterance by first translating the speech waveform into a sequence of feature vectors using the Mel Frequency Cepstral Coefficients (MFCC) technique; the observation likelihood is then estimated using the Forward algorithm. The HMM parameters are estimated by applying the Baum-Welch algorithm to training samples. The most likely state sequence is then decoded using the Viterbi algorithm, producing the recognized word. The project covers the English digits zero through nine, modeled as isolated words. Two modules were developed: isolated-word speech recognition and continuous speech recognition. Both modules were tested in clean and noisy environments and showed relatively successful recognition rates. In the clean environment, the isolated-word module achieved 99.5% in multi-speaker mode and 79.5% in speaker-independent mode, while the continuous speech module achieved 70% in multi-speaker mode and 55% in speaker-independent mode.
In the noisy environment, the isolated-word module achieved 88% in multi-speaker mode and 67% in speaker-independent mode, while the continuous speech module achieved 92.5% in multi-speaker mode and 75% in speaker-independent mode. These recognition rates are relatively successful when compared with those of similar systems.
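
The recognition steps named in the summary (Forward scoring of each word model, then Viterbi decoding) can be illustrated with a minimal sketch. This is not the authors' MATLAB code. It is written in Python and assumes discrete observation symbols (for example, vector-quantized MFCC frames), whereas the book may model the feature vectors differently. The two toy word models, their probabilities, and the function names `forward_log_likelihood`, `viterbi`, and `recognize` are all hypothetical and chosen only for illustration.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled Forward algorithm."""
    n = len(start)
    # Initialisation: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * b_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale at every frame to avoid underflow on long utterances.
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like


def _safe_log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start, trans, emit):
    """Return the most likely hidden state sequence for obs (log domain)."""
    n = len(start)
    delta = [_safe_log(start[i]) + _safe_log(emit[i][obs[0]]) for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev, delta, ptr = delta, [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] + _safe_log(trans[i][j]))
            ptr.append(best_i)
            delta.append(
                prev[best_i] + _safe_log(trans[best_i][j]) + _safe_log(emit[j][o])
            )
        backpointers.append(ptr)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def recognize(obs, models):
    """Isolated-word recognition: pick the word whose HMM scores obs highest."""
    return max(models, key=lambda word: forward_log_likelihood(obs, *models[word]))


# Hypothetical two-state left-to-right models over two symbols (0 and 1).
# Each entry is (start probabilities, transition matrix, emission matrix).
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "zero": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.2, 0.8]]),
    "one": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.8, 0.2]]),
}
```

Calling `recognize([0, 0, 1, 1], MODELS)` scores the symbol sequence against every word model and returns the best-scoring word, and `viterbi([0, 0, 1, 1], *MODELS["zero"])` returns the corresponding state path. In a full system the model parameters would come from Baum-Welch training rather than being written by hand as they are here.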