Abstract

Video summarization is the task of condensing the important content of a video into an easily consumable form, so that a viewer does not have to watch the entire video. Millions of videos are recorded and shared every day, ranging from consumer recordings, such as a birthday party or wedding video, to industry productions in film and television. We have constructed a model that addresses the problem of viewers being unable to consume all of the media presented to them because of time constraints. To do this, we conduct two separate experiments. The first experiment examines the role of different parts of the summarization model, namely modality, sampling rate, and data scaling, so that we better understand how summaries are generated. The second experiment uses these findings to create a model based on classification. We use classification as a means of interpreting a wide variety of video types for summarization. By using classifiers to generate the video and audio features used by the summarizer, we leverage both the granularity of the classifiers and the maturity of classification as a research problem to accomplish the summarization task. We found that while scaling and sampling of the data have little effect on the overall summary, modality played a large role in the results of each experiment. While many models exclude audio, we found that there are benefits to including it when generating a video summary. We also found that the use of classification separated the impact of each modality, with video serving to construct the shape of the summary and audio determining the importance scores.

Library of Congress Subject Headings

Video recordings--Data processing; Automatic classification; Automatic abstracting; Image processing--Digital techniques

Publication Date

8-2020

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Engineering (MS)

Department, Program, or Center

Computer Engineering (KGCOE)

Advisor

Alexander Loui

Advisor/Committee Member

Corey Merkel

Advisor/Committee Member

Andres Kwasinski

Recommended Citation

Wells, Brendan, "Using Classification for Analysis of Multi-Modal Video Summarization" (2020). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/10509

Campus

RIT – Main Campus

Plan Codes

CMPE-MS
