Noise is introduced into images at almost every stage of the camera image signal processing (ISP) pipeline. Camera vendors provide software that removes most of the noise added at each stage, but even after this cleanup, noise patterns of varying intensity remain in the image. Modern machine learning and deep learning models are built as end-to-end systems with a special-purpose feature extraction phase. This thesis focuses on removing residual noise from images during the feature extraction stage. Feature extraction is performed with the classic segmentation architecture, U-Net. Traditionally, segmentation models have been used to identify the locations of objects in images. In this thesis, a U-Net based architecture is used to identify important regions in an image in order to localize background noise. With this noise removed, the resulting images are cleaner and provide better input for other tasks such as image classification, object segmentation, and scene understanding. The MNIST and Fashion-MNIST datasets were used to train prototypes of the proposed architectures. To build an effective system, a noise model was created to reflect the properties of true noise found in images. Various Gaussian and speckle noise models were used during the initial prototyping phase, and the final prototype used a combination of these noise models, which better represents the noise that actually occurs in natural images. Because multiple types of noise occur together in images, a realistic representation of this noise was modeled using a Mixture of Gaussians and tested on a complex dataset, ImageNet. The proposed system worked well in denoising complex, invisible noise, such as adversarial noise, from these images. 
The effectiveness of the proposed approach was evaluated using signal-structure metrics such as PSNR and SSIM, along with Precision, Recall, and F1-score, which quantify the improvement on computer vision tasks.
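
As a rough illustration of the noise models and the PSNR metric named in the abstract, the following is a minimal NumPy sketch, not the thesis implementation. The function names, the noise standard deviations, the order in which the two noise types are combined, and the assumption that pixel values lie in [0, 1] are all hypothetical choices made for this example.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng):
    """Additive Gaussian noise: img + n, with n ~ N(0, sigma^2)."""
    noisy = img + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_speckle_noise(img, sigma, rng):
    """Multiplicative (speckle) noise: img + img * n, with n ~ N(0, sigma^2)."""
    noisy = img + img * rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_mixed_noise(img, sigma_gauss, sigma_speckle, rng):
    """One possible combination (an assumption): speckle first, then Gaussian."""
    speckled = add_speckle_noise(img, sigma_speckle, rng)
    return add_gaussian_noise(speckled, sigma_gauss, rng)

def psnr(reference, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))

# Stand-in for a 28x28 grayscale image (the MNIST image size), values in [0, 1].
rng = np.random.default_rng(0)
clean = rng.random((28, 28))
noisy = add_mixed_noise(clean, sigma_gauss=0.1, sigma_speckle=0.2, rng=rng)
print(f"PSNR of noisy image vs. clean image: {psnr(clean, noisy):.2f} dB")
```

A denoiser would be judged by how much it raises the PSNR of its output relative to the noisy input, measured against the same clean reference.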

Library of Congress Subject Headings

Image segmentation; Image processing--Digital techniques; Machine learning; Pattern recognition systems

Publication Date


Document Type


Student Type


Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science (GCCIS)


Thomas B. Kinsman

Advisor/Committee Member

Alexander G. Ororbia II

Advisor/Committee Member

Joe Geigel


RIT – Main Campus

Plan Codes