ENHANCING SPAM COMMENT DETECTION ON SOCIAL MEDIA WITH EMOJI FEATURE AND POST-COMMENT PAIRS APPROACH USING ENSEMBLE METHODS OF MACHINE LEARNING

Mrs. Sri Lavanya Sajja; Lokineni Yuktha; N. Esther; P. Charitha

Keywords:

improve detection performance, performance in detecting, spam comment identification, SVM (RBF kernel), best average performance, social media

Abstract

Whenever a well-known public figure posts on social media, many people are prompted to leave comments. Unfortunately, not every comment is relevant to the post. Some of the comments are spam, which can disrupt the overall flow of information. This study used two approaches to address text spam detection on social media. The first was to use emojis, which had been widely disregarded in previous research even though social media users rely on them heavily to express their intent. The second was to use stacked post-comment pairs, unlike many spam detection methods that looked at comment-only data. The post-comment pairs were needed to determine whether a comment was relevant to the post context (i.e., not spam) or was spam. The study used the SpamID-Pair dataset, collected from social media, to detect spam comments in Indonesian. A thorough analysis showed that the stacked post-comment pairs, ensemble voting, and the emoji-text feature could improve detection performance (F1 and accuracy). Adding manual features improved detection performance further. In the experiments, SVM with an RBF kernel was the best stand-alone method for spam comment identification, and the soft voting ensemble gave the best average performance.
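
The abstract names three ideas: keeping emojis as features, stacking each post with its comment, and soft voting. They can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation. The function names, the emoji code-point ranges, the `[SEP]` separator token, and the example probabilities are all assumptions made for illustration.

```python
import re
from typing import List, Sequence

# Assumption: a rough, non-exhaustive emoji range; everything else is split into word tokens.
TOKEN_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]" + r"|\w+")


def tokenize_with_emoji(text: str) -> List[str]:
    """Return lowercase word tokens, keeping each emoji as its own token."""
    return TOKEN_RE.findall(text.lower())


def stack_pair(post: str, comment: str) -> List[str]:
    """Stack a post and its comment into one token sequence.

    The "[SEP]" marker is a hypothetical separator so that a downstream
    model can tell post tokens from comment tokens.
    """
    return tokenize_with_emoji(post) + ["[SEP]"] + tokenize_with_emoji(comment)


def soft_vote(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-class probabilities over all models and return the top class index."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


if __name__ == "__main__":
    print(stack_pair("New album out", "Follow me 🔥"))
    # Made-up probabilities from three models for classes [not spam, spam].
    print(soft_vote([[0.9, 0.1], [0.45, 0.55], [0.45, 0.55]]))
```

In the last call, two of the three hypothetical models lean slightly toward spam, so a majority (hard) vote would pick class 1. Soft voting averages the probabilities instead, and the one confident model tips the result to class 0.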