Abstract
The human brain is inherently good at pattern recognition, and AI researchers have long struggled to match that level of performance in machine vision algorithms. As an extension of pattern recognition, humans are also exceptional at learning selectively and bridging the gap when data is missing, even when the missing data is in complex image and video formats. Given enough contextual information, the human brain is able to surmise and comprehend the scene with reasonable certainty. We believe that selective learning aids in selectively filling in the missing data. To this end, we experiment with partial convolutions, which allow the network to learn selectively. The idea behind partial convolutions is simple: we take the semantic segmentation masks obtained from our novel semantic segmentation network and apply convolutions only to the unmasked pixels of the images. As a result, the image embedding obtained at the end of the encoder, which is encoded in the latent vector space, contains no data from the masked region of the image. When a decoder network rebuilds the image from this embedding, the object in the masked region is removed. Furthermore, the problem of image and video in-painting is reformulated as a domain transfer problem, which allows our network to be trained in a semi-supervised manner. Our network is trained end-to-end, uses less computational power during training, and offers performance close to the current state of the art. We test our network extensively on different datasets, and the results of our experiments with partial convolutions and the selective learning network have been promising. We used the Places2 and Cityscapes datasets for the image experiments and the DAVIS 2017 dataset for the video experiments.
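The partial convolution described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function name `partial_conv2d`, the single-channel input, the stride of 1 without padding, and the rescaling by the fraction of valid pixels are all assumptions made for illustration, following one common formulation of partial convolution. The `valid` array is assumed to be 1 for unmasked pixels and 0 for pixels covered by the segmentation mask.

```python
import numpy as np


def partial_conv2d(image, valid, kernel, bias=0.0):
    """Single-channel partial convolution (stride 1, no padding).

    image  : (H, W) float array
    valid  : (H, W) array, 1 where a pixel may be used, 0 where it is masked
    kernel : (kh, kw) float array
    Returns (output, updated_valid). Illustrative sketch only.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    updated_valid = np.zeros((out_h, out_w))
    window_size = kh * kw

    for i in range(out_h):
        for j in range(out_w):
            m = valid[i:i + kh, j:j + kw]
            n_valid = m.sum()
            if n_valid == 0:
                # No unmasked pixel under the kernel: output stays 0
                # and the location remains masked.
                continue
            x = image[i:i + kh, j:j + kw]
            # Masked pixels contribute nothing; rescale the response to
            # compensate for the pixels that were left out.
            response = (kernel * x * m).sum() * (window_size / n_valid)
            output[i, j] = response + bias
            updated_valid[i, j] = 1.0
    return output, updated_valid


# Hypothetical example: a constant image whose masked 3x3 centre holds
# arbitrary values that must not leak into the output.
image = np.full((5, 5), 2.0)
image[1:4, 1:4] = 99.0
valid = np.ones((5, 5))
valid[1:4, 1:4] = 0.0
kernel = np.full((3, 3), 1.0 / 9.0)

out, new_valid = partial_conv2d(image, valid, kernel)
```

In this example, every output location whose window touches at least one unmasked pixel is computed from unmasked pixels only, so the values stored in the masked region never reach the output. The only location whose window is fully masked stays at zero and remains masked in `new_valid`.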
Library of Congress Subject Headings
Image processing--Digital techniques; Computer vision; Machine learning; Pattern recognition systems
Publication Date
4-2022
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Cory E. Merkel
Advisor/Committee Member
Alexander Loui
Advisor/Committee Member
Dongfang Liu
Recommended Citation
Mandya Nagaraju, Rajiv, "Semi-Supervised Video and Image in-painting" (2022). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/11111
Campus
RIT – Main Campus
Plan Codes
CMPE-MS