Abstract

This research project proposes a system that aids visually impaired users by providing partial contextual information about their surroundings, using a 360° view camera combined with deep learning. The system pairs the 360° view camera with a mobile device to capture the surrounding scene and conveys contextual information to the user as audio. The system could also be used for other applications, such as logo detection, which visually impaired users could use for shopping assistance.

The scene information from the spherical camera feed is classified by identifying objects that carry contextual information about the scene. This is achieved with convolutional neural networks (CNNs), leveraging transfer learning from the pre-trained VGG-19 network. The work involves two challenges: a classification challenge and a segmentation challenge. As an initial prototype, we experimented with general classes such as restaurants, coffee shops, and street signs. We achieved a classification accuracy of 92.8% in this research project.
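The transfer learning idea mentioned above can be illustrated with a minimal sketch: a feature extractor is kept frozen and only a new classification head is trained on the target classes. This is not the thesis implementation. The `frozen_features` function is a hypothetical stand-in for the pre-trained VGG-19 layers, the images are synthetic 12-value lists, and all names and hyperparameters are assumptions chosen for illustration.

```python
import math
import random

# Class names taken from the abstract; everything else is illustrative.
CLASSES = ["restaurant", "coffee shop", "street sign"]


def frozen_features(image):
    """Hypothetical stand-in for a frozen, pre-trained feature extractor.

    In the thesis this role is played by VGG-19; here the "features" are
    just the mean of each third of the input, and they are never trained.
    """
    n = len(image) // 3
    return [sum(image[i * n:(i + 1) * n]) / n for i in range(3)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def predict_proba(W, b, feats):
    logits = [sum(w * f for w, f in zip(W[k], feats)) + b[k]
              for k in range(len(W))]
    return softmax(logits)


def train_head(samples, labels, n_classes, epochs=200, lr=0.5):
    """Train only the new softmax head with plain SGD on cross-entropy."""
    dim = len(samples[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            p = predict_proba(W, b, feats)
            for k in range(n_classes):
                grad = p[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    W[k][j] -= lr * grad * feats[j]
                b[k] -= lr * grad
    return W, b


def make_image(label, rng):
    """Synthetic 12-value "image": the third matching `label` is bright."""
    image = []
    for chunk in range(3):
        base = 0.8 if chunk == label else 0.0
        image.extend(base + 0.2 * rng.random() for _ in range(4))
    return image


def predict(W, b, image):
    probs = predict_proba(W, b, frozen_features(image))
    return max(range(len(probs)), key=lambda k: probs[k])


rng = random.Random(0)
train_labels = [i % 3 for i in range(60)]
train_feats = [frozen_features(make_image(y, rng)) for y in train_labels]
W, b = train_head(train_feats, train_labels, len(CLASSES))

test_labels = [i % 3 for i in range(30)]
test_images = [make_image(y, rng) for y in test_labels]
preds = [predict(W, b, img) for img in test_images]
accuracy = sum(p == y for p, y in zip(preds, test_labels)) / len(test_labels)
print(f"accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here describes only the synthetic toy data and has no relation to the 92.8% reported for the thesis. In a real setting, the stand-in extractor would be replaced by the convolutional layers of a pre-trained network, with the head retrained on the new classes in the same way.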

Library of Congress Subject Headings

People with visual disabilities--Services for; Computer vision; Machine learning; Image processing--Digital techniques; Classification

Publication Date

8-2017

Document Type

Thesis

Student Type

Graduate

Degree Name

Electrical Engineering (MS)

Department, Program, or Center

Electrical Engineering (KGCOE)

Advisor

Ferat Sahin

Advisor/Committee Member

Gill Tsouri

Advisor/Committee Member

Sildomar Monteiro

Campus

RIT – Main Campus

Plan Codes

EEEE-MS
