Identifying abnormality in videos is an area of active research. Most existing work relies heavily on supervised approaches. Although these methods often perform well, their major drawback is that abnormalities cannot be confined to a fixed set of classes, which motivates unsupervised models for this task. We combine Dirichlet Process Mixture Models (DPMM) with autoencoders to learn the normality in the data. Autoencoders have been used extensively in the literature for feature extraction, and they enable us to capture rich features in a low-dimensional space. We use the stick-breaking formulation of the DPMM, a non-parametric counterpart of the Gaussian mixture model that can create new clusters as more data is observed. We exploit this property of the stick-breaking model to incorporate online learning and prediction in an unsupervised manner. We first introduce a two-phase model that performs feature extraction through autoencoders in the first step and model inference through the DPMM in the second step. We then seek to improve upon this model with one that performs both feature extraction and model inference end to end, by adapting the stick-breaking formulation to the Variational Autoencoder (VAE) setting.
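
To illustrate the stick-breaking construction mentioned in the abstract, the following is a minimal sketch of how mixture weights can be drawn by repeatedly breaking off a fraction of the remaining stick. It is not the thesis implementation: the function name `stick_breaking_weights`, the concentration value `alpha=2.0`, and the truncation at 10 sticks are assumptions chosen only for illustration.

```python
import random


def stick_breaking_weights(alpha, num_sticks, seed=0):
    """Draw truncated stick-breaking weights (illustrative sketch only).

    alpha      -- concentration parameter (hypothetical value chosen by the caller)
    num_sticks -- truncation level; a true Dirichlet process has infinitely many
    """
    rng = random.Random(seed)
    remaining = 1.0  # length of the stick not yet assigned to any cluster
    weights = []
    for _ in range(num_sticks):
        fraction = rng.betavariate(1.0, alpha)  # v_k ~ Beta(1, alpha)
        weights.append(remaining * fraction)    # pi_k = v_k * prod_{j<k} (1 - v_j)
        remaining *= 1.0 - fraction             # shrink the leftover stick
    return weights, remaining


weights, leftover = stick_breaking_weights(alpha=2.0, num_sticks=10)
```

The leftover stick length is the probability mass still available to clusters that have not been created yet, which is the property that lets the number of clusters grow as more data is observed.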

Library of Congress Subject Headings

Video surveillance--Data processing; Image processing--Digital techniques; Optical pattern recognition; Machine learning

Publication Date


Document Type


Student Type


Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science (GCCIS)


Ifeoma Nwogu

Advisor/Committee Member

Linwei Wang

Advisor/Committee Member

Matthew Hoffman


RIT – Main Campus

Plan Codes