AI-Driven Phishing Detection Using URL and Content-Based Features

Authors

  • E. A. Feukeu, Dept. Computer System Engineering, Tshwane University of Technology, Pretoria, South Africa

DOI:

https://doi.org/10.62647/

Keywords:

phishing detection, random forest, URL analysis, HTML features, cybersecurity, AI in security, blacklists, feature engineering, entropy, redirection

Abstract

Phishing remains a critical threat in cybersecurity, leveraging social engineering and deceptive website tactics to
steal user credentials and financial information. Traditional countermeasures, such as URL blacklists and static
rule sets, struggle to adapt to the evolving sophistication of phishing campaigns. This research introduces an AI-based
detection system utilizing a random forest classifier trained on a diverse dataset of 40,000 labeled URLs,
collected from verified phishing databases and legitimate sources. Our approach combines 25 handcrafted features
spanning URL structure, HTML content analysis, and domain metadata. Key features include domain age, HTTPS
usage, number of scripts, URL entropy, and redirection behavior. The proposed model achieves a detection
accuracy of 96.2% and a false-positive rate of 2.3%, significantly outperforming traditional blacklist methods. We
also conduct feature importance analysis, identifying the most discriminative indicators of phishing activity.
Lightweight by design, the system is deployable as a real-time browser plugin or email gateway filter. This study
contributes to adaptive threat prevention and lays the groundwork for integrating deep learning techniques in
future phishing defense solutions.
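
As an illustration of the URL-structure features named in the abstract (HTTPS usage and URL entropy), the following is a minimal Python sketch using only the standard library. It is not the authors' implementation: the function names, the dictionary keys, and the extra lexical features (URL length, dot count, "@" check) are assumptions chosen for illustration, and the abstract does not specify how the 25 features are defined. Features that need network or page access (domain age, number of scripts, redirection behavior) are left out.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def url_features(url: str) -> dict:
    """Extract a few lexical URL features (hypothetical feature names)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "uses_https": parsed.scheme == "https",   # HTTPS usage
        "url_entropy": shannon_entropy(url),      # entropy of the full URL string
        "url_length": len(url),                   # assumed extra feature
        "num_dots_in_host": host.count("."),      # assumed extra feature
        "has_at_symbol": "@" in url,              # assumed extra feature
    }
```

A feature dictionary of this kind could be turned into a numeric vector and passed to a random forest classifier; the choice of library and hyperparameters is not stated in the abstract.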

Published

30-12-2018

How to Cite

AI-Driven Phishing Detection Using URL and Content-Based Features. (2018). International Journal of Information Technology and Computer Engineering, 6(4), 99-106. https://doi.org/10.62647/