Abstract
In 2020, the Federal Bureau of Investigation (FBI) found phishing to be the most common cybercrime, with a record number of complaints from Americans reporting losses exceeding $4.1 billion. Various phishing prevention methods exist; however, they are usually reactive, activating only after a phishing campaign has been launched. Priming people ahead of time with knowledge of which phishing topics are more likely to occur could be an effective proactive prevention strategy. The volume of phishing emails has been noted to increase around key calendar dates and during times of uncertainty. This thesis aimed to create a classifier to predict which phishing topics are more likely to occur in relation to an external event. After roughly 1.2 million phishing emails were reduced to only their meaningful words, a latent Dirichlet allocation (LDA) topic model uncovered 90 latent phishing topics. In one of the topic evaluation tasks, human evaluators agreed with the composition of a topic 74% of the time on average, indicating that the topics produced by the LDA model accord with human judgment. Each topic was turned into a time series by counting its frequency over the dataset's two-year timespan. This time series was then converted into an intensity count to highlight days of increased phishing activity. All phishing topics were reviewed for influencing events, and ten topics were identified as having external events that could have influenced their intensities. Intervention analysis found that none of the selected topics correlated with its identified external event. The analysis stopped there, and no predictive classifiers were pursued. With this dataset, temporal patterns coupled with external events could not predict the likelihood of a phishing attack.
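The step from per-topic frequency counts to an intensity series can be illustrated with a minimal sketch. The abstract does not state how intensity was computed, so the measure used here (how many standard deviations a day's count sits above the series mean, floored at zero) is an assumption, and the function names `daily_counts` and `intensity` and the sample records are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev


def daily_counts(records, topic, start, end):
    """Count emails assigned to `topic` on each day in [start, end].

    `records` is an iterable of (day, topic_id) pairs; days with no
    emails for the topic are filled with zero.
    """
    counts = Counter(day for day, t in records if t == topic)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(d, counts.get(d, 0)) for d in days]


def intensity(series):
    """Assumed intensity measure (not taken from the thesis):
    standard deviations above the series mean, floored at zero."""
    values = [c for _, c in series]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [(d, 0.0) for d, _ in series]
    return [(d, max(0.0, (c - mu) / sigma)) for d, c in series]


# Hypothetical sample: (received date, dominant LDA topic id) per email.
records = [
    (date(2020, 4, 13), 7),
    (date(2020, 4, 14), 7),
    (date(2020, 4, 15), 7),
    (date(2020, 4, 15), 7),
    (date(2020, 4, 15), 7),
    (date(2020, 4, 15), 12),
    (date(2020, 4, 17), 7),
]

series = daily_counts(records, topic=7, start=date(2020, 4, 13), end=date(2020, 4, 17))
scores = intensity(series)
peak_day = max(scores, key=lambda pair: pair[1])[0]
```

Flooring at zero keeps only days of above-average activity, which matches the stated goal of highlighting days of increased phishing activity; a rolling baseline would be a reasonable alternative if the topic volume drifts over the two years.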
Library of Congress Subject Headings
Phishing--Forecasting; Phishing--Classification; Computer crimes--Prevention
Publication Date
11-2021
Document Type
Thesis
Student Type
Graduate
Degree Name
Industrial and Systems Engineering (MS)
Department, Program, or Center
Industrial and Systems Engineering (KGCOE)
Advisor
Katie McConky
Advisor/Committee Member
Ruben Proano
Recommended Citation
Bliss, Erika, "Analyzing Temporal Patterns in Phishing Email Topics" (2021). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11001
Campus
RIT – Main Campus
Plan Codes
ISEE-MS