ENHANCING SPAM COMMENT DETECTION ON SOCIAL MEDIA WITH EMOJI FEATURE AND POST-COMMENT PAIRS APPROACH USING ENSEMBLE METHODS OF MACHINE LEARNING
Keywords:
improve detection performance, performance in detecting, spam comment identification, SVM (RBF kernel), best average performance, social media
Abstract
Whenever a well-known public figure posts on social media, many people are prompted to leave comments. Unfortunately, not every comment is relevant to the post. Some comments are spam, which can disrupt the overall flow of information. This study used two approaches to address text spam detection on social media. The first was to use emojis, which previous research had largely disregarded even though social media users rely on them heavily to express their intent. The second was to use stacked post-comment pairs, unlike many spam detection algorithms that look at comment-only data. The post-comment pairs were needed to determine whether a comment was related to the context of the post (i.e., not spam) or was spam. The study used the SpamID-Pair dataset, collected from social media, to identify spam comments in Indonesian. A thorough analysis showed that stacked post-comment pairs, ensemble voting, and the emoji-text feature could improve detection performance (F1 and accuracy). Adding manual features improved detection performance further. In the experiments, SVM (RBF kernel) was the best stand-alone method for spam comment identification, and the soft voting ensemble achieved the best average performance.
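The two ideas in the abstract, keeping emojis as features in a post-comment pair and combining classifiers by soft voting, can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The emoji character ranges, the `[SEP]` separator used to join post and comment, the function names, and the example probabilities are all assumptions made for illustration; the abstract does not specify how the pairs are stacked or which base models enter the ensemble besides SVM (RBF kernel).

```python
import re

# Assumption: a rough emoji matcher over common Unicode emoji blocks.
# The abstract does not describe the study's actual emoji handling.
TOKEN_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]" + r"|\w+")


def tokenize(text):
    """Return lowercase word tokens, keeping each emoji as its own token."""
    return TOKEN_RE.findall(text.lower())


def build_pair(post, comment):
    """Join post and comment tokens into one sequence.

    Assumption: the pair is formed by concatenation with a hypothetical
    "[SEP]" marker, so a classifier can see the comment in its post context.
    """
    return tokenize(post) + ["[SEP]"] + tokenize(comment)


def soft_vote(model_probs):
    """Average per-class probabilities across models and pick the top class.

    model_probs is a list of dicts, one per model, mapping label -> probability.
    """
    labels = list(model_probs[0])
    avg = {
        label: sum(probs[label] for probs in model_probs) / len(model_probs)
        for label in labels
    }
    return max(avg, key=avg.get), avg


if __name__ == "__main__":
    pair = build_pair("Liburan di Bali", "Cek bio \U0001F525\U0001F525 promo!")
    print(pair)

    # Made-up probabilities from three hypothetical base classifiers.
    probs = [
        {"spam": 0.90, "not_spam": 0.10},
        {"spam": 0.40, "not_spam": 0.60},
        {"spam": 0.45, "not_spam": 0.55},
    ]
    label, avg = soft_vote(probs)
    print(label, avg)
```

With these made-up probabilities, two of the three models lean towards "not_spam", so a majority (hard) vote would pick "not_spam", while the averaged probabilities favor "spam". That difference is what distinguishes soft voting from hard voting: one confident model can outweigh two weakly opposed ones.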
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.