Emulating human cognitive approach for speech emotion recognition using MLP and GenSoFNN

Bibliographic Details
Main Authors: Kamaruddin, Norhaslinda, Abdul Rahman, Abdul Wahab
Format: Conference or Workshop Item
Language: English
Published: 2013
Online Access: http://irep.iium.edu.my/31010/
http://irep.iium.edu.my/31010/1/Table_of_Content.pdf
http://irep.iium.edu.my/31010/2/106.pdf
Description
Summary: Speech emotion recognition is a growing field due to the increasing need for effective human-computer interaction. There are many approaches in terms of feature extraction methods coupled with classifiers to obtain optimum performance. However, none can claim superiority, as performance is highly data-dependent and domain-oriented. In this paper, appropriate sets of features are investigated using a segregation method and the Automatic Relevance Determination (ARD) feature ranking algorithm [1]. Two popular classifiers, the Multi Layer Perceptron (MLP) [2] and the Generic Self-organizing Fuzzy Neural Network (GenSoFNN) [3], are employed to discriminate emotions in the FAU Aibo Emotion Corpus [4, 5]. The experimental results show that Mel Frequency Cepstral Coefficient (MFCC) [6] features yield accuracy comparable with the baseline result [5]. In addition, it is observed that MLP can perform slightly better than GenSoFNN. Hence, an appropriate combination of extracted features and a good classifier is fundamental for a good speech emotion recognition system.
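
As an illustrative sketch only (not the authors' implementation, and not tied to the FAU Aibo corpus), the general pipeline described in the abstract, MFCC feature extraction followed by MLP classification, can be approximated in Python with librosa and scikit-learn. The file paths, labels, and network size below are hypothetical placeholders.

    import numpy as np
    import librosa
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split

    def extract_mfcc(wav_path, n_mfcc=13):
        # Load the audio at its native sampling rate, compute frame-level MFCCs,
        # then average over time to obtain a fixed-length utterance-level vector.
        signal, sr = librosa.load(wav_path, sr=None)
        mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
        return mfcc.mean(axis=1)

    # Hypothetical usage with an emotion-labelled corpus:
    # wav_files = [...]   # list of paths to speech recordings
    # labels = [...]      # corresponding emotion labels
    # X = np.stack([extract_mfcc(p) for p in wav_files])
    # y = np.array(labels)
    # X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y)
    # clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    # clf.fit(X_tr, y_tr)
    # print("accuracy:", clf.score(X_te, y_te))

The time-averaged MFCC vector is a common simplification; the paper's own feature segregation and ARD-based ranking, as well as the GenSoFNN classifier, are not reproduced here.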