Abstract
Accidental leakage of tokens and credentials is a significant concern in digital environments. Current secret-detection methods are ruleset-based: they apply pattern recognition within strings (i.e., regex rules) to pinpoint characteristics that resemble various types of secrets. However, regex cannot detect secrets that lack a specific pattern, such as passwords. This research explores whether heuristics and machine learning can provide a reliable and accurate method for determining whether a given string is merely inconsequential data or a leaked secret requiring timely attention, without the limitations of a traditional pattern-based system like regex. This thesis proposes two new techniques to detect secrets, one using a solution vector and the other using a machine learning model based on a neural network.
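To make the contrast in the abstract concrete, the following is a minimal sketch, not the thesis's method. It pairs one regex rule (the widely documented shape of an AWS access key ID) with a simple character-entropy heuristic. The rule set, the function names, and the thresholds (`min_length`, `min_entropy`) are all assumptions chosen for illustration; the abstract does not describe the solution vector or the neural network in enough detail to reproduce them, so entropy is used here only as a stand-in heuristic.

```python
import math
import re
from collections import Counter

# Hypothetical ruleset with a single rule: the publicly documented
# shape of an AWS access key ID ("AKIA" plus 16 uppercase alphanumerics).
REGEX_RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def regex_matches(text: str) -> list[str]:
    """Return the names of all rules whose pattern occurs in text."""
    return [name for name, pattern in REGEX_RULES.items() if pattern.search(text)]


def shannon_entropy(s: str) -> float:
    """Shannon entropy of s in bits per character (0.0 for an empty string)."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_like_secret(s: str, min_length: int = 8, min_entropy: float = 3.0) -> bool:
    """Illustrative heuristic only: flag strings that are long enough and
    whose characters are varied enough. Both thresholds are assumed values."""
    return len(s) >= min_length and shannon_entropy(s) >= min_entropy


if __name__ == "__main__":
    token_line = "key = AKIAIOSFODNN7EXAMPLE"  # AWS's documented example key ID
    password = "Tr0ub4dor&3x!"                 # no fixed pattern to match

    print(regex_matches(token_line))    # the regex rule fires on the token
    print(regex_matches(password))      # no rule fires on the password
    print(looks_like_secret(password))  # the heuristic still flags it
```

A single-feature threshold like this is crude and will also flag some ordinary strings, so it should be read as an illustration of the gap that pattern-free detection aims to fill rather than as a usable detector.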
Library of Congress Subject Headings
Computer security--Automation; Regular expressions (Computer science); Machine learning; Neural networks (Computer science)
Publication Date
5-2025
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
College
Golisano College of Computing and Information Sciences
Advisor
Ivona Bezakova
Advisor/Committee Member
Warren Carithers
Advisor/Committee Member
Dan Pless
Recommended Citation
Burdick-Pless, Jesse, "Beyond RegEx – Heuristic-based Secret Detection" (2025). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12111
Campus
RIT – Main Campus
Plan Codes
COMPSCI-MS