Abstract
In recent years, the world has undergone a digital revolution that has transformed many aspects of human life, redefining how individuals interact, how businesses operate, and how societies function. Businesses in particular now deal with customers and with one another through digital tools and platforms for transactions. Phishing is a deceptive technique that combines technological and social tricks to obtain sensitive personal information, including financial details. A common attack mode involves installing malicious software on electronic devices to gain unauthorized access to personal information, particularly the usernames and passwords of consumers’ accounts. Such software is disseminated through compromised URLs, which reach users via forum postings, instant messages, text messages, telephone calls, and emails. This paper proposes a machine learning (ML) solution that predicts whether a URL is phishing or legitimate, with the aim of helping users and system administrators detect phishing activity before a malicious URL is opened. The system is based on an XGBoost (gradient-boosted decision tree) model trained on 180,000 URLs. On the testing set, the model correctly classified 99.44% of phishing websites and 95.84% of legitimate websites, for an overall accuracy of 98.75%. Its precision was 99.01%, meaning that only about 0.99% of the websites flagged as phishing were actually legitimate, and its F1 score was 99.22%. The study concludes that ML models can be useful in detecting phishing activity and recommends further research with deep learning techniques capable of handling time-dependent features.
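The reported metrics can be checked against one another with a short calculation. The sketch below is illustrative and not taken from the thesis: it assumes phishing is the positive class, so that the 99.44% of phishing websites correctly classified is the recall, and the function and variable names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
precision = 0.9901
recall = 0.9944  # assumption: share of phishing websites correctly classified

f1 = f1_score(precision, recall)
# Share of websites flagged as phishing that are actually legitimate.
false_discovery_rate = 1 - precision

print(f"F1 score: {f1:.2%}")
print(f"Flagged as phishing but legitimate: {false_discovery_rate:.2%}")
```

Under that assumption, the calculation gives an F1 score of 99.22% and a 0.99% share of legitimate websites among those flagged as phishing, which matches the figures stated in the abstract.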
Publication Date
12-13-2023
Document Type
Master's Project
Student Type
Graduate
Degree Name
Professional Studies (MS)
Department, Program, or Center
Graduate Programs & Research
Advisor
Ehsan Warriach
Recommended Citation
Alfalasi, Hamad, "TOWARDS AN EFFICIENT MACHINE LEARNING BASED PHISH DETECTION SYSTEM" (2023). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12333
Campus
RIT Dubai
