Abstract

Identifier naming is a core element of software engineering and program comprehension. While existing literature suggests that naming consistency is ideal, there is limited empirical data on identifier homonyms: instances where the same name is used for different behaviors. This thesis investigates these naming patterns through a manual analysis inspired by grounded theory, supported by static analysis with srcML across three open-source projects in Java, C, and Python. By establishing a behavioral framework through axial coding, this research classifies 771 identifiers into functional categories to document their presence and role within the code. The results reveal that naming redundancy is not a simple error but a complex phenomenon in which consistent clones in infrastructure code coexist with homonyms in domain-specific logic. Rather than prejudging these patterns as harmful, this study focuses on mapping and documenting their existence. The final contribution is a manually verified dataset and a behavioral taxonomy, providing a foundation for future research to evaluate the actual impact of these naming choices on software maintenance.

Publication Date

5-8-2026

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science, Department of

College

Golisano College of Computing and Information Sciences

Advisor

Christian D. Newman

Advisor/Committee Member

Andy Meneely

Advisor/Committee Member

Robert St. Jacques

Campus

RIT – Main Campus
