Abstract

The growing use of the internet has resulted in new websites emerging every day (Total number of Websites - Internet Live Stats, 2020). Web surfing has become important for everyone, regardless of occupation, age, or location. However, as internet use increases, so does the vulnerability to malware attacks through malicious websites (Softpedia, 2016). Identifying and dealing with such malicious websites has been difficult in the past because it is challenging to separate good websites from bad ones. By applying machine learning algorithms to large datasets, however, it is now possible to detect such websites beforehand. Classifiers trained with algorithms such as logistic regression and Support Vector Machine (SVM) can be used to detect malicious websites, so that users can be warned about the risk before they visit such sites. This project uses a variety of classification algorithms to determine whether a website is malicious, using the Kaggle Malicious and Benign Website Dataset. We show that it is possible to detect malicious websites with a reasonable degree of certainty (90% of the 75 malicious websites in the test set were identified) using machine learning models. We also determine the features that were critical in predicting the likelihood of a website being malicious. Most of our key features are easily available (URL length, number of special characters, country, age of website).
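As a rough illustration of two ideas in the abstract, the sketch below derives two of the easily available features (URL length and number of special characters) from a URL string and computes the share of malicious sites that a classifier identified, which is the quantity behind the reported 90% figure (recall on the malicious class). This is not the thesis's code. The function names, the definition of a "special character" as any non-alphanumeric character, and the example labels are all assumptions made for illustration only.

```python
from typing import Dict, List


def url_features(url: str) -> Dict[str, int]:
    """Return two lexical features of a URL (assumed definitions)."""
    # Assumption: a "special character" is any character that is
    # neither a letter nor a digit; the dataset may define it differently.
    special = sum(1 for ch in url if not ch.isalnum())
    return {"url_length": len(url), "special_chars": special}


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of truly malicious sites (label 1) flagged as malicious."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    # Return 0.0 when there are no malicious sites to avoid dividing by zero.
    return true_pos / actual_pos if actual_pos else 0.0


if __name__ == "__main__":
    print(url_features("http://example.com/a?b=1"))

    # Hypothetical labels for six sites; these are not the thesis's data.
    y_true = [1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 1]
    print(recall(y_true, y_pred))
```

Recall is reported here rather than overall accuracy because malicious sites are usually the minority class, so a model could score a high accuracy while still missing most of them.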

Publication Date

4-20-2020

Document Type

Master's Project

Student Type

Graduate

Degree Name

Professional Studies (MS)

Department, Program, or Center

Graduate Programs & Research (Dubai)

Advisor

Ehsan Warriach

Recommended Citation

Al Tamimi, Saeed Ahmad, "Detecting Malicious Websites Using Machine Learning" (2020). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10723

Campus

RIT Dubai


