Using Text Classification for Identifying Harmful Language on Social Media

Mrs. Neha Shireen; MOHAMMED MUJTABA AHMED; MOHD ABDUL MUHAIMIN; MOHAMMED RAYYANUDDIN QURESHI

doi:10.62647/

Authors

Mrs. Neha Shireen
MOHAMMED MUJTABA AHMED
MOHD ABDUL MUHAIMIN
MOHAMMED RAYYANUDDIN QURESHI

Abstract

Offensive language is becoming worryingly common in crowd-generated content across social media platforms. Such language can intimidate or offend an individual or a group. Researchers have been studying the automatic detection and prevention of such language for some time, and have produced a variety of supervised approaches and training datasets. In this work we propose a text classification architecture consisting of a modular cleaning step, a tokenizer, three embedding approaches, and eight classifiers. Our experiments on a dataset we obtained from Twitter for the purpose of detecting offensive language show encouraging results. With hyperparameter tuning taken into account, three algorithms (AdaBoost, SVM, and MLP) achieved the highest average F1-score with the popular TF-IDF embedding approach.
Index Terms—offensive language detection, social media, machine learning, text mining
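
The stages named in the abstract (cleaning, tokenizing, TF-IDF weighting, classification, F1 scoring) can be illustrated with a minimal, standard-library Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not state the cleaning rules, the tokenizer, or any hyperparameters, so the regular expressions, the smoothed IDF formula, and the four toy training texts are hypothetical. The classifier is a simple nearest-centroid stand-in chosen to keep the example self-contained; it is not AdaBoost, SVM, or MLP, and no tuning is performed.

```python
import math
import re
from collections import Counter

def clean(text):
    """Cleaning step (assumed rules): lowercase, drop URLs and @mentions."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    return re.sub(r"@\w+", " ", text)

def tokenize(text):
    """Tokenizer (assumed): runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text)

def fit_idf(token_lists):
    """Smoothed inverse document frequency for every training term."""
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def tfidf(tokens, idf):
    """L2-normalised sparse TF-IDF vector; terms unseen in training are dropped."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def fit_centroids(vectors, labels):
    """Stand-in classifier: mean TF-IDF vector per class."""
    counts = Counter(labels)
    sums = {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, Counter())
        for t, v in vec.items():
            acc[t] += v
    return {lab: {t: v / counts[lab] for t, v in acc.items()}
            for lab, acc in sums.items()}

def predict(vec, centroids):
    """Pick the class whose centroid has the largest dot product with vec."""
    def dot(a, b):
        return sum(v * b.get(t, 0.0) for t, v in a.items())
    return max(sorted(centroids), key=lambda lab: dot(vec, centroids[lab]))

def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

# Hypothetical toy data (1 = offensive, 0 = not offensive).
train = [
    ("You are an idiot @bob", 1),
    ("Shut up, you stupid loser", 1),
    ("Have a lovely day everyone", 0),
    ("Thanks for the great talk https://example.com", 0),
]
test = [("what a stupid idiot", 1), ("great day, thanks", 0)]

train_tokens = [tokenize(clean(text)) for text, _ in train]
idf = fit_idf(train_tokens)
centroids = fit_centroids([tfidf(toks, idf) for toks in train_tokens],
                          [label for _, label in train])

predictions = [predict(tfidf(tokenize(clean(text)), idf), centroids)
               for text, _ in test]
score = f1_score([label for _, label in test], predictions)
print(predictions, score)
```

Because each stage is a separate function, the cleaning rules, the embedding, or the classifier can be swapped independently, which is one plausible reading of the "modular" design the abstract describes. In practice a library such as scikit-learn would supply the TF-IDF vectorizer, the AdaBoost, SVM, and MLP classifiers, and a cross-validated hyperparameter search.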

Published

28-04-2025

Issue

Vol. 13 No. 2 (2025): Volume 13 Issue 2 2025

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

How to Cite

Using Text Classification for Identifying Harmful Language on Social Media. (2025). International Journal of Information Technology and Computer Engineering, 13(2), 880-884. https://doi.org/10.62647/