AI-Driven Phishing Detection Using URL and Content-Based Features
DOI:
https://doi.org/10.62647/
Keywords:
phishing detection, random forest, URL analysis, HTML features, cybersecurity, AI in security, blacklists, feature engineering, entropy, redirection
Abstract
Phishing remains a critical threat in cybersecurity, leveraging social engineering and deceptive website tactics to
steal user credentials and financial information. Traditional countermeasures, such as URL blacklists and static
rule sets, struggle to adapt to the evolving sophistication of phishing campaigns. This research introduces an AI-based
detection system utilizing a random forest classifier trained on a diverse dataset of 40,000 labeled URLs,
collected from verified phishing databases and legitimate sources. Our approach combines 25 handcrafted features
spanning URL structure, HTML content analysis, and domain metadata. Key features include domain age, HTTPS
usage, number of scripts, URL entropy, and redirection behavior. The proposed model achieves a detection
accuracy of 96.2% and a false-positive rate of 2.3%, significantly outperforming traditional blacklist methods. We
also conduct feature importance analysis, identifying the most discriminative indicators of phishing activity.
Lightweight by design, the system is deployable as a real-time browser plugin or email gateway filter. This study
contributes to adaptive threat prevention and lays the groundwork for integrating deep learning techniques in
future phishing defense solutions.
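
As a rough illustration of the URL-structure features named in the abstract, the following minimal Python sketch computes character-level Shannon entropy and HTTPS usage for a URL. The function names, the exact feature set, and the subdomain and `@` checks are assumptions made for this example; the abstract does not specify how the 25 features are defined or computed.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits (0.0 for an empty string)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def url_features(url: str) -> dict:
    """Hypothetical URL-only feature extractor (not the authors' feature set)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "uses_https": parsed.scheme == "https",         # HTTPS usage
        "url_length": len(url),                         # assumed extra feature
        "url_entropy": shannon_entropy(url),            # URL entropy
        "num_subdomains": max(host.count(".") - 1, 0),  # assumed extra feature
        "has_at_symbol": "@" in url,                    # assumed extra feature
    }
```

A feature dictionary of this kind could be turned into a numeric vector and passed to a random forest classifier. Domain age, script counts, and redirection behavior would need WHOIS lookups and page fetches, which this sketch leaves out.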
License
Copyright (c) 2018 Author

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.











