DETECTING PHISHING ATTACKS VIA HYBRID MACHINE LEARNING MODELS BASED ON URL ANALYSIS

Authors

  • N Mounika
  • Mrs. A. Radha Rani

Keywords:

Phishing-detection models, PILU-90K dataset, Logistic Regression, TF-IDF feature extraction, Cybersecurity, URL Classification

Abstract

To show how the effectiveness of phishing-detection models can degrade over time, we trained a baseline model on older datasets and tested it on newer URLs. The results indicated declining accuracy, so we carried out an extensive analysis of current phishing domains to identify new trends and tactics used by attackers. To support this research, we created a new dataset, Phishing Index Login URL-90,000 (PILU-90K). It contains 60,000 legitimate URLs (index and login pages) and 30,000 phishing URLs. Using this dataset, we built a Logistic Regression model with TF-IDF feature extraction, which achieved 98.50% accuracy in recognizing login URLs.
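
The pairing of TF-IDF features with Logistic Regression described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the tokenisation rule, the smoothed IDF formula, the learning rate, the epoch count, and every URL in the toy list are assumptions made for illustration only. The sketch does not use PILU-90K and does not reproduce the reported 98.50% accuracy.

```python
import math
import re
from collections import Counter

# Hypothetical toy data (NOT from PILU-90K): label 0 = legitimate, 1 = phishing.
TOY_URLS = [
    "https://www.example.com/login",
    "https://www.example.org/index.html",
    "https://shop.example.com/account/login",
    "https://www.example.edu/",
    "http://example-secure.test/verify/login/update.php",
    "http://login.account-verify.test/signin/confirm.php",
    "http://198.51.100.7/secure/login/verify.html",
]
TOY_LABELS = [0, 0, 0, 0, 1, 1, 1]


def tokenize(url):
    """Split a URL into lowercase alphanumeric tokens (assumed tokenisation)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


def fit_idf(urls):
    """Compute a smoothed inverse document frequency for each token."""
    n = len(urls)
    df = Counter(tok for u in urls for tok in set(tokenize(u)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(url, idf):
    """Return an L2-normalised sparse TF-IDF vector; unseen tokens are dropped."""
    tf = Counter(t for t in tokenize(url) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(urls, labels, epochs=200, lr=0.5):
    """Fit logistic regression on TF-IDF vectors with plain per-sample SGD."""
    idf = fit_idf(urls)
    vectors = [tfidf(u, idf) for u in urls]
    weights = {t: 0.0 for t in idf}
    bias = 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            p = sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))
            err = p - y  # gradient of the log-loss with respect to the logit
            bias -= lr * err
            for t, v in vec.items():
                weights[t] -= lr * err * v
    return idf, weights, bias


def predict_proba(url, model):
    """Estimated probability that a URL is phishing under the fitted model."""
    idf, weights, bias = model
    vec = tfidf(url, idf)
    return sigmoid(bias + sum(weights[t] * v for t, v in vec.items()))


if __name__ == "__main__":
    model = train(TOY_URLS, TOY_LABELS)
    for url, label in zip(TOY_URLS, TOY_LABELS):
        print(label, round(predict_proba(url, model), 3), url)
```

On a dataset of realistic size, an established library implementation of TF-IDF vectorisation and regularised logistic regression would normally replace these hand-written functions. The sketch only makes the data flow explicit: URL, then tokens, then weighted vector, then probability.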

Published

09-11-2024

How to Cite

DETECTING PHISHING ATTACKS VIA HYBRID MACHINE LEARNING MODELS BASED ON URL ANALYSIS. (2024). International Journal of Information Technology and Computer Engineering, 12(4), 155-163. https://ijitce.org/index.php/ijitce/article/view/769