AUDIO BASED HATESPEECH CLASSIFICATION FROM ONLINE SHORT-FORM VIDEOS

Authors

  • G. Manasa Author
  • A. Akshay Author
  • B. Shiva Ganesh Sai Raj Author
  • C. Sripriya Author
  • D. Tejaswini Author

Keywords:

hate speech, tiktok, audio classification, machine learning, speech processing

Abstract

In this study, we pioneer the development of an audio-based hate speech classifier for online short-form TikTok videos using traditional machine learning algorithms: Logistic Regression, Random Forest, and Support Vector Machines. We scraped over 4,746 videos using the TikTok API tool and extracted audio features (MFCCs, Spectral Centroid, Spectral Rolloff, Spectral Bandwidth, Zero-Crossing Rate, and Chroma values) as the primary feature sets. Results show that hate speech detection with the extracted predictors reaches up to 78.5% accuracy on an optimized Random Forest model, crossing the 50% benchmark for models in this task. In addition, a comparison of Information Gain scores and globally learned model weights identifies Spectral Rolloff and MFCCs as the top predictors for discriminating hate speech in the Filipino language.
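
The Information Gain ranking mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single numeric feature split at one threshold and uses made-up toy values. The function names, the threshold, and the data are hypothetical and are not taken from the paper, whose exact scoring procedure is not described here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting a numeric feature at `threshold`.

    Assumption: a single binary split; real tools may discretize differently.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


# Toy, made-up feature values (e.g. a normalized spectral rolloff) and labels.
rolloff = [0.2, 0.3, 0.8, 0.9]
labels = ["non-hate", "non-hate", "hate", "hate"]
gain = information_gain(rolloff, labels, threshold=0.5)
```

In this toy case the split separates the two classes completely, so the gain equals the full entropy of the labels. A feature whose split leaves both classes evenly mixed on each side would score zero.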

Published

06-01-2024

How to Cite

AUDIO BASED HATESPEECH CLASSIFICATION FROM ONLINE SHORT-FORM VIDEOS. (2024). International Journal of Information Technology and Computer Engineering, 12(1), 514-521. https://ijitce.org/index.php/ijitce/article/view/578