Abstract

Phishing remains a serious cybersecurity threat because such attacks can easily bypass rule-based and blacklist-based detection systems. This thesis presents a hybrid approach that combines statistical analysis and machine learning to identify phishing websites. Statistical preprocessing methods, such as PCA and decision tree-based feature selection, are used to filter the most important URL features that a website may possess. A carefully balanced dataset is used, and the non-parametric Mann-Whitney U test validates whether the features are statistically significant. The thesis concludes by building a neural network model that accurately classifies phishing websites using the optimized URL features.

Publication Date

12-2025

Document Type

Thesis

Student Type

Graduate

Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research

Advisor

Sanjay Modak

Advisor/Committee Member

Hammou Messatfa

Campus

RIT Dubai
