Providing machines with a robust visualization of multiple objects in a scene has many applications in the physical world. This research addresses the task of multi-label image recognition using a deep learning approach. In most multi-label image recognition datasets, a single image contains multiple objects, and the same label appears many times throughout the dataset. It is therefore inefficient to classify each object in isolation; instead, it is important to infer the inter-dependencies between the labels. To extract a latent representation of the pixels in an image, this work uses a convolutional network approach and evaluates three different image feature extraction networks. To learn the label inter-dependencies, this work proposes a graph convolutional network approach, in contrast to previous approaches such as probabilistic graphical models or recurrent neural networks. In the graph neural network approach, the image labels are first encoded into word embeddings, which serve as the nodes of a graph. The correlations between these nodes are then learned using graph neural networks. We investigate how to create the adjacency matrix without manually calculating the label correlations in the respective datasets. The proposed approach is evaluated on the widely used PASCAL VOC, MSCOCO, and NUS-WIDE multi-label image recognition datasets. The main evaluation metrics are mean average precision and overall F1 score, which are used to show that the learned adjacency matrix method for labels, together with the addition of visual attention for image features, achieves performance similar to that of manually calculating the label adjacency matrix.
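The pipeline described in the abstract (label word embeddings as graph nodes, graph convolutions over an adjacency matrix, and the result applied to an image feature vector) can be illustrated with a minimal sketch. This is not the thesis implementation: the two-layer depth, the ReLU activation, the sigmoid used to keep adjacency entries positive, the symmetric normalization, and all names and dimensions below are assumptions chosen for illustration. The random values stand in for trained word embeddings, CNN features, and learned parameters.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def normalize_adjacency(adj):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(deg_i * deg_j)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(a_norm, h, w, activate):
    """One graph convolution: A_norm @ H @ W, optionally followed by ReLU."""
    out = matmul(matmul(a_norm, h), w)
    if activate:
        out = [[max(0.0, v) for v in row] for row in out]
    return out


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def label_scores(image_feature, label_embeddings, adj_params, w1, w2):
    """Map label embeddings to per-label classifiers and score one image."""
    # Assumption: the adjacency is a free parameter matrix (trained elsewhere);
    # a sigmoid keeps every edge weight in (0, 1).
    adj = [[sigmoid(v) for v in row] for row in adj_params]
    a_norm = normalize_adjacency(adj)
    hidden = gcn_layer(a_norm, label_embeddings, w1, activate=True)
    classifiers = gcn_layer(a_norm, hidden, w2, activate=False)
    # Each row of `classifiers` is one label's weight vector over image features.
    return [sum(c * x for c, x in zip(row, image_feature)) for row in classifiers]


rng = random.Random(0)
num_labels, embed_dim, hidden_dim, feature_dim = 4, 6, 5, 8  # hypothetical sizes

label_embeddings = random_matrix(num_labels, embed_dim, rng)  # stand-in for word embeddings
adj_params = random_matrix(num_labels, num_labels, rng)       # stand-in for learned adjacency
w1 = random_matrix(embed_dim, hidden_dim, rng)
w2 = random_matrix(hidden_dim, feature_dim, rng)
image_feature = [rng.uniform(0.0, 1.0) for _ in range(feature_dim)]  # stand-in for CNN output

scores = label_scores(image_feature, label_embeddings, adj_params, w1, w2)
probabilities = [sigmoid(s) for s in scores]  # independent per-label probabilities
```

In this sketch the adjacency parameters are fixed random numbers. In a trained model they would be updated by gradient descent together with `w1` and `w2`, which is one way to avoid computing label co-occurrence statistics by hand.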

Library of Congress Subject Headings

Neural networks (Computer science); Graph theory; Machine learning; Pattern recognition systems; Image processing--Digital techniques

Publication Date


Document Type


Student Type


Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)


Raymond Ptucha

Advisor/Committee Member

Andres Kwasinski

Advisor/Committee Member

Alexander Loui


RIT – Main Campus

Plan Codes