Abstract
Video summarization refers to condensing the important contents of a video into an easily consumable form, so that a viewer does not have to watch the entire video. Millions of videos are recorded and shared every day. These videos range from the consumer level, such as a birthday party or wedding video, all the way up to industry productions such as film and television. We have constructed a model that seeks to address the problem of viewers being unable to consume all the media presented to them because of time constraints. To do this, we conduct two separate experiments. The first experiment examines the role of different parts of the summarization model, namely modality, sampling rate, and data scaling, so that we better understand how summaries are generated. The second experiment uses these findings to create a model based on classification. We use classification as a means of interpreting a wide variety of video types for summarization. By using classification to generate the video and audio features used by the summarizer, we leverage both the granularity of the classifiers and the maturity of classification problems to accomplish a summarization task. We found that while scaling and sampling of the data have little effect on the overall summary, in each experiment the modality played a large role in the results. While many models exclude audio, we found that there are benefits to including this data when generating a video summary. We also found that the use of classification resulted in a separation of impacts for each modality, with video serving to construct the shape of the summary and audio determining the importance score.
Library of Congress Subject Headings
Video recordings--Data processing; Automatic classification; Automatic abstracting; Image processing--Digital techniques
Publication Date
8-2020
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Engineering (MS)
Department, Program, or Center
Computer Engineering (KGCOE)
Advisor
Alexander Loui
Advisor/Committee Member
Corey Merkel
Advisor/Committee Member
Andres Kwasinski
Recommended Citation
Wells, Brendan, "Using Classification for Analysis of Multi-Modal Video Summarization" (2020). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/10509
Campus
RIT – Main Campus
Plan Codes
CMPE-MS