ENHANCING PHISHING DETECTION: MACHINE LEARNING APPROACH WITH FEATURE SELECTION AND DEEP LEARNING MODELS

1 B Rajasri, 2 D Sainath Reddy, 3 B Rushikesh, 4 E Ravi Teja, 5 B Mukhesh

doi:10.62643/

Abstract

Phishing attacks have become a major cybersecurity threat, exploiting users through malicious URLs and deceptive websites. This project proposes a hybrid machine learning framework for phishing URL detection that combines the strengths of XGBoost and a Deep Neural Network (DNN). The dataset used in this project is the Phishing Website Dataset (e.g., https://www.kaggle.com/datasets/mdsultanulislamovi/phishingwebsite-detectiondatasets), which contains labeled instances of legitimate and phishing URLs. The system begins with data preprocessing, including feature extraction, scaling, and selection, to ensure high-quality input. The processed data is then split into training and testing sets. XGBoost captures complex patterns in structured data, while the DNN, implemented in PyTorch, learns deep feature representations and uses dropout to reduce overfitting. The predictions of the two models are then combined to improve accuracy and robustness. Performance is evaluated using accuracy, the confusion matrix, and loss curves. Experimental results show that the hybrid model achieves an accuracy of approximately 96–98%, outperforming each individual model and providing reliable and efficient phishing detection. This approach contributes to improved cybersecurity by enabling accurate identification of malicious URLs and strengthening protection against phishing attacks.
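The abstract does not state how the predictions of the two models are combined. The following is a minimal sketch only, assuming a weighted average of the phishing probabilities produced by each model (soft voting), followed by the accuracy and confusion-matrix evaluation the abstract mentions. The function names, the weight `w_xgb`, the 0.5 threshold, and the toy scores are all hypothetical illustrations, not values or code from the project.

```python
from typing import List, Tuple


def combine_probabilities(p_xgb: List[float], p_dnn: List[float],
                          w_xgb: float = 0.5) -> List[float]:
    """Weighted average of two models' phishing probabilities.

    Assumption: soft voting; the source only says the predictions are
    "combined using a hybrid approach".
    """
    if len(p_xgb) != len(p_dnn):
        raise ValueError("both models must score the same samples")
    if not 0.0 <= w_xgb <= 1.0:
        raise ValueError("w_xgb must lie in [0, 1]")
    return [w_xgb * a + (1.0 - w_xgb) * b for a, b in zip(p_xgb, p_dnn)]


def to_labels(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Map probabilities to labels: 1 = phishing, 0 = legitimate."""
    return [1 if p >= threshold else 0 for p in probs]


def confusion_matrix(y_true: List[int], y_pred: List[int]) -> Tuple[int, int, int, int]:
    """Return the counts (tn, fp, fn, tp) for binary labels."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tn + tp) / (tn + fp + fn + tp)


# Toy scores standing in for the outputs of a trained XGBoost model and DNN.
xgb_scores = [0.9, 0.2, 0.6, 0.4]
dnn_scores = [0.7, 0.1, 0.3, 0.8]
true_labels = [1, 0, 1, 1]

hybrid_labels = to_labels(combine_probabilities(xgb_scores, dnn_scores))
print(confusion_matrix(true_labels, hybrid_labels))
print(accuracy(true_labels, hybrid_labels))
```

In a sketch like this, the weight and threshold would normally be tuned on a validation split rather than on the test set, so that the reported accuracy stays an unbiased estimate.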