Abstract

The rapid growth of deep learning has helped solve problems across many fields of study. Convolutional neural networks have become a go-to tool for extracting features from images. Similarly, variants of recurrent neural networks, such as Long Short-Term Memory and Gated Recurrent Unit architectures, do a good job of extracting useful information from temporal data such as text and time series. Although these networks are good at extracting features for a particular modality, learning features across multiple modalities is still a challenging task. In this work, we develop a generative common vector space model in which similar concepts from different modalities are brought closer together in a common latent space representation, while dissimilar concepts are pushed far apart in the same space. The developed model not only aims to solve the cross-modal retrieval problem but also uses the vector generated by the common vector space model to generate realistic-looking data. This work focuses mainly on the image and text modalities; however, it can be extended to other modalities as well. We train the model and evaluate its performance on the Caltech CUB and Oxford-102 datasets.
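The idea of pulling matching concepts together and pushing mismatched ones apart in a shared space can be illustrated with a small sketch. The abstract does not state which objective the thesis uses; the triplet margin loss below is one common choice and is an assumption made here for illustration only. The function names, the margin value, and the toy three-dimensional embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    a, b = l2_normalize(a), l2_normalize(b)
    return 1.0 - sum(x * y for x, y in zip(a, b))


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the positive is closer to the anchor
    than the negative by at least `margin` (assumed objective, not
    necessarily the one used in the thesis)."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


# Toy embeddings in a shared 3-D space (made-up values for illustration).
image_embedding = [0.9, 0.1, 0.0]   # e.g. an image of a bird
matching_text = [0.8, 0.2, 0.0]     # caption describing the same bird
unrelated_text = [0.0, 0.1, 0.9]    # caption describing something else

# Well-aligned triplet: the matching caption is already much closer.
loss_good = triplet_margin_loss(image_embedding, matching_text, unrelated_text)

# Swapped triplet: the "positive" is far away, so the loss is large.
loss_bad = triplet_margin_loss(image_embedding, unrelated_text, matching_text)

print(loss_good, loss_bad)
```

In this sketch, a loss of zero means the triplet already satisfies the margin, while a positive loss would drive the encoders to move the embeddings. In a trained system the vectors would come from image and text encoders rather than being written by hand.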

Library of Congress Subject Headings

Machine learning; Neural networks (Computer science); Convolutions (Mathematics); Information retrieval; Data mining

Publication Date

2-2020

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Raymond Ptucha

Advisor/Committee Member

Alexander Loui

Advisor/Committee Member

Andres Kwasinski

Recommended Citation

Udaiyar, Premkumar, "Cross-modal data retrieval and generation using deep neural networks" (2020). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/10562

Campus

RIT – Main Campus

Plan Codes

CMPE-MS
