Abstract
Identifier naming is a core element of software engineering and program comprehension. While existing literature suggests that naming consistency is ideal, there is limited empirical data on identifier homonyms: instances where the same name is used for different behaviors. This thesis investigates these naming patterns through a manual analysis inspired by grounded theory, supported by static analysis with srcML across three open-source projects in Java, C, and Python. By establishing a behavioral framework through axial coding, this research classifies 771 identifiers into functional categories to document their presence and role within the code. The results reveal that naming redundancy is not a simple error but a complex phenomenon in which consistent clones in infrastructure code coexist with homonyms in domain-specific logic. Instead of prejudging these patterns as harmful, this study focuses on mapping and documenting their existence. The final contribution is a manually verified dataset and a behavioral taxonomy, providing a foundation for future research to evaluate the actual impact of these naming choices on software maintenance.
Publication Date
5-8-2026
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science, Department of
College
Golisano College of Computing and Information Sciences
Advisor
Christian D. Newman
Advisor/Committee Member
Andy Meneely
Advisor/Committee Member
Robert St. Jacques
Recommended Citation
Palomino Lau, Jose, "A comparative Static Analysis of Identifier Clones and Homonyms in Open Source Projects" (2026). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/12657
Campus
RIT – Main Campus
