In this thesis, we discuss the importance of data normalization in deep learning and its relationship to generalization. Normalization is a staple of deep learning architectures and has been shown to improve the stability and generalizability of deep learning models, yet why these normalization techniques work remains an open question and an active area of research. Motivated by this uncertainty, we explore how different normalization techniques perform when employed in different deep learning architectures, while also examining generalization and metrics associated with generalization alongside our investigation into normalization. The goal of our experiments was to determine whether any identifiable trends exist for the different normalization methods across an array of training schemes with respect to the various metrics employed. We found that class similarity was seemingly the strongest predictor of train accuracy, test accuracy, and generalization ratio across all employed metrics. Overall, BatchNorm and EvoNormB0 generally performed best on measures of train and test accuracy, while InstanceNorm and Plain (no normalization) performed worst.
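As background for the techniques named above: BatchNorm and InstanceNorm differ only in which axes of a feature tensor the normalizing statistics are computed over. The following is a minimal NumPy sketch of that distinction (not code from the thesis); the tensor shape and epsilon value are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3, 4, 4))  # (batch, channels, height, width) -- example shape
eps = 1e-5  # small constant for numerical stability

# BatchNorm: statistics per channel, pooled over batch and spatial dims.
mu_b = x.mean(axis=(0, 2, 3), keepdims=True)
var_b = x.var(axis=(0, 2, 3), keepdims=True)
bn = (x - mu_b) / np.sqrt(var_b + eps)

# InstanceNorm: statistics per (sample, channel), over spatial dims only,
# so each image is normalized independently of the rest of the batch.
mu_i = x.mean(axis=(2, 3), keepdims=True)
var_i = x.var(axis=(2, 3), keepdims=True)
inorm = (x - mu_i) / np.sqrt(var_i + eps)
```

Because InstanceNorm's statistics do not depend on other samples in the batch, its behavior is unaffected by batch size, whereas BatchNorm's is; this axis choice is the entire mechanical difference between the two layers (learnable scale/shift parameters, omitted here, are applied identically in both).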

Library of Congress Subject Headings

Deep learning (Machine learning); Data structures (Computer science); Computational learning theory

Publication Date


Document Type


Student Type


Degree Name

Applied and Computational Mathematics (MS)

Department, Program, or Center

School of Mathematical Sciences (COS)


Advisor

Nathan Cahill

Advisor/Committee Member

Ernest Fokoue

Advisor/Committee Member

Matthew Hoffman


Campus

RIT – Main Campus

Plan Codes