In 2020, the Federal Bureau of Investigation (FBI) found phishing to be the most common cybercrime, with a record number of complaints from Americans reporting losses exceeding $4.1 billion. Various phishing prevention methods exist; however, these methods are usually reactive in nature, as they activate only after a phishing campaign has been launched. Priming people ahead of time with the knowledge of which phishing topic is more likely to occur could be an effective proactive phishing prevention strategy. It has been noted that the volume of phishing emails tends to increase around key calendar dates and during times of uncertainty. This thesis aimed to create a classifier to predict which phishing topics have an increased likelihood of occurring in relation to an external event. After around 1.2 million phishing emails were distilled until only meaningful words remained, a latent Dirichlet allocation (LDA) topic model uncovered 90 latent phishing topics. On average, human evaluators agreed with the composition of a topic 74% of the time in one of the phishing topic evaluation tasks, indicating that human judgment accorded with the topics produced by the LDA model. Each topic was turned into a time series by creating a frequency count over the dataset’s two-year timespan. This time series was then converted into an intensity count to highlight the days of increased phishing activity. All phishing topics were analyzed and reviewed for influencing events. After the review, ten topics were identified as having external events that could have influenced their respective intensities. An intervention analysis then found that none of the selected topics correlated with its identified external event. The analysis stopped here, and no predictive classifiers were pursued. With this dataset, temporal patterns coupled with external events were not able to predict the likelihood of a phishing attack.
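The step that turns each topic into a time series can be illustrated with a minimal Python sketch. It assumes each email has already been assigned a single dominant topic, and it uses a z-score as a stand-in for the "intensity count", whose exact definition the abstract does not give. The function names, the topic ids, and the sample data are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import mean, pstdev


def daily_counts(emails, topic, start, end):
    """Count emails assigned to `topic` on each day in [start, end].

    `emails` is an iterable of (day, topic_id) pairs. Days with no
    matching email are kept with a count of 0 so the series has no gaps.
    """
    counts = Counter(day for day, t in emails if t == topic)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(d, counts.get(d, 0)) for d in days]


def intensity(series):
    """ASSUMPTION: intensity is modeled here as a z-score per day.

    The thesis's own intensity measure may differ; this only shows one
    way to make days of unusually high activity stand out.
    """
    values = [c for _, c in series]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A flat series has no unusually active days.
        return [(d, 0.0) for d, _ in series]
    return [(d, (c - mu) / sigma) for d, c in series]


# Hypothetical sample: (date received, dominant topic id).
emails = [
    (date(2020, 3, 1), 7),
    (date(2020, 3, 1), 7),
    (date(2020, 3, 2), 12),
    (date(2020, 3, 4), 7),
]

series = daily_counts(emails, topic=7, start=date(2020, 3, 1), end=date(2020, 3, 4))
scores = intensity(series)
```

Keeping zero-count days in the series matters for any later comparison against event dates, since a gap would otherwise be indistinguishable from a quiet day.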
Library of Congress Subject Headings
Phishing--Forecasting; Phishing--Classification; Computer crimes--Prevention
Industrial and Systems Engineering (MS)
Department, Program, or Center
Industrial and Systems Engineering (KGCOE)
Bliss, Erika, "Analyzing Temporal Patterns in Phishing Email Topics" (2021). Thesis. Rochester Institute of Technology. Accessed from
RIT – Main Campus