Speech Emotion Recognition Using Machine Learning
DOI:
https://doi.org/10.62647/

Keywords:
Emotion recognition, speech signal processing, digital signal processing (DSP), feature extraction, feature matching, Mel-frequency cepstral coefficients (MFCCs), neural networks, automatic voice recognition, speech-based health applications.

Abstract
Emotion is a complex state of a human being that reflects the physical, physiological, or mental condition of a person. According to the human sciences, emotion is a mental process arising from neural mechanisms and their disorders. Digital processing of the speech signal is essential for fast and precise automatic voice recognition technology. It is now used in health care, telephony, the military, and assistive systems for people with disabilities; consequently, digital signal processing stages such as feature extraction and feature matching are active topics in the study of voice signals. In order to extract useful information from the speech signal, make decisions about it, and obtain results, the data must be manipulated and analyzed. The basic method used for extracting the features of the voice signal is to compute the Mel-frequency cepstral coefficients. Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively represent the short-term power spectrum of a sound, based on a linear cosine transform of a log power spectrum on a nonlinear mel scale of frequency. After the features are calculated, neural networks are used to model speech recognition. Based on this speech model, the system decides whether the uttered speech matches what the speaker was prompted to utter.
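The abstract does not name a specific implementation, but the pipeline it describes (MFCC feature extraction followed by a neural-network model) can be sketched briefly. The sketch below is a minimal, hypothetical illustration only: it assumes the librosa library for MFCC computation and scikit-learn's MLPClassifier as a stand-in for the neural network, and the file paths, labels, and parameter values are placeholders rather than the authors' actual setup.

import numpy as np
import librosa                                      # assumed library for MFCC extraction
from sklearn.neural_network import MLPClassifier    # assumed stand-in for the neural network

def extract_mfcc(path, n_mfcc=13):
    # Load the waveform at its native sampling rate.
    y, sr = librosa.load(path, sr=None)
    # Compute MFCCs: short-term power spectrum mapped onto the mel scale,
    # log-compressed, then decorrelated with a discrete cosine transform.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Average over time frames to obtain one fixed-length feature vector per utterance.
    return mfcc.mean(axis=1)

# Hypothetical training data: audio files and their emotion labels (placeholders).
train_files  = ["happy_01.wav", "sad_01.wav", "angry_01.wav"]
train_labels = ["happy", "sad", "angry"]

X_train = np.array([extract_mfcc(f) for f in train_files])

# A small feed-forward neural network models the mapping from MFCC features to emotions.
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
model.fit(X_train, train_labels)

# Decision step: the predicted label for a new utterance is compared with the prompted one.
predicted = model.predict([extract_mfcc("test_utterance.wav")])[0]
print("match" if predicted == "happy" else "mismatch")

For reference, the nonlinear frequency warping the abstract mentions is commonly expressed as m = 2595 * log10(1 + f / 700), which maps a frequency f in hertz onto the mel scale before the log-power spectrum and cosine transform are applied.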
License
Copyright (c) 2026 Dr. K. Ashok Kumar, R Shailaja, C Sri varsha Reddy, K Srilekha (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.