Speech Emotion Recognition using Machine Learning

D. Kiranmai; T. Narasimha; Sk. Khadar Munna; B. J. Siva Ashish; M. Mahitha Reddy

Keywords:

Classification Algorithms, Convolutional Neural Networks (CNNs), Feature Extraction, Recurrent Neural Networks (RNNs), Speech Emotion Recognition (SER)

Abstract

Speech Emotion Recognition (SER) has gained prominence because of its diverse applications and the difficulty of analysing emotional content in speech. Reaching 98% accuracy in SER demonstrates the effectiveness of advanced feature extraction and classification techniques. Key methods include Mel-Frequency Cepstral Coefficients (MFCCs) for feature extraction and a range of classification algorithms: Support Vector Machines (SVMs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks, and Transformers. Hybrid approaches, such as combining multiple classifiers and fusing features, further improve accuracy. This level of performance underscores the value of integrating sophisticated algorithms to address the inherently subjective task of detecting emotion from speech signals.
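
The MFCC feature extraction step named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not state the extraction settings, so everything below is an assumption chosen from common defaults: a 16 kHz sample rate, 25 ms frames with a 10 ms hop, a pre-emphasis factor of 0.97, 26 mel filters, and 13 coefficients. The function names (`mfcc`, `mel_filterbank`, `utterance_features`) are hypothetical and are not taken from the paper. In practice an audio library would normally handle this step.

```python
import numpy as np


def hz_to_mel(hz):
    # Common mel-scale formula (assumed; other variants exist).
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix for a mono signal.

    All default parameter values are assumptions, not values from the paper.
    The signal must contain at least frame_len samples.
    """
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies before spectral analysis.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (frames are zero-padded to n_fft).
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # Mel filterbank energies, then log compression (epsilon avoids log(0)).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalised DCT-II over the filter axis; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energies @ basis.T


def utterance_features(signal, **kwargs):
    """Summarise frame-level MFCCs as one fixed-length vector (mean and std)."""
    coeffs = mfcc(signal, **kwargs)
    return np.concatenate([coeffs.mean(axis=0), coeffs.std(axis=0)])


if __name__ == "__main__":
    # Synthetic one-second 440 Hz tone as a stand-in for a speech recording.
    t = np.arange(16000) / 16000.0
    tone = np.sin(2 * np.pi * 440.0 * t)
    print(mfcc(tone).shape)
    print(utterance_features(tone).shape)
```

A fixed-length vector such as the one from `utterance_features` is one plausible input for a classifier like an SVM, while the frame-level matrix from `mfcc` suits sequence models such as CNNs, LSTMs, or Transformers. Which representation the authors used is not stated in the abstract.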