Abstract

Accidental leakage of tokens and credentials is a significant concern in digital environments. Current methods for detecting such secrets rely on rulesets: they apply pattern matching to strings (i.e., regex rules) to pinpoint characteristics that resemble known types of secrets. However, regex cannot detect secrets that lack a specific pattern, such as passwords. This research explores whether heuristics and machine learning can provide a reliable and accurate way to determine if a given string is merely inconsequential data or a leaked secret requiring timely attention, without the limitations of a traditional pattern-based system like regex. This thesis proposes two new techniques for detecting secrets: one using a solution vector and the other using a machine learning model based on a neural network.
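
The limitation of ruleset-based detection described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's solution vector or neural network model. The `RULES` table, the AWS-style key pattern, and the helpers `regex_matches` and `shannon_entropy` are all hypothetical names introduced here for illustration, and Shannon entropy is only one commonly used heuristic, assumed here as an example of a signal that does not depend on a fixed pattern.

```python
import math
import re
from collections import Counter

# Hypothetical ruleset for illustration: one rule for AWS-style access key IDs
# ("AKIA" followed by 16 uppercase letters or digits).
RULES = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def regex_matches(text):
    """Return the names of all rules whose pattern occurs in the text."""
    return [name for name, pattern in RULES.items() if pattern.search(text)]


def shannon_entropy(text):
    """Return the Shannon entropy of the text in bits per character.

    Assumed here as an example heuristic: random-looking strings tend to
    score higher than natural-language text, regardless of their format.
    """
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    samples = ["key=AKIAIOSFODNN7EXAMPLE", "password = hunter2"]
    for sample in samples:
        print(sample, regex_matches(sample), round(shannon_entropy(sample), 2))
```

In this sketch the structured key is caught by the rule, while the password line matches nothing because it has no fixed format. A heuristic score such as entropy is computed for any string, which is the kind of pattern-independent signal that a heuristic or learned detector could build on.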

Library of Congress Subject Headings

Computer security--Automation; Regular expressions (Computer science); Machine learning; Neural networks (Computer science)

Publication Date

5-2025

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Science (MS)

College

Golisano College of Computing and Information Sciences

Advisor

Ivona Bezakova

Advisor/Committee Member

Warren Carithers

Advisor/Committee Member

Dan Pless

Campus

RIT – Main Campus

Plan Codes

COMPSCI-MS
