Abstract
Historically, it has been difficult to measure how far a learned concept has deviated. Several schemes have been proposed to attack this challenging problem [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]. The central idea of all these efforts is to detect the change point at which the data mining model deviates significantly from the characteristics of the data it was trained or built on. The process of detecting such change points is often referred to as concept drift detection. Current algorithms assume attribute independence, treat the problem as a supervised learning problem, and require labeled data. The proposed algorithm makes no assumption of attribute independence and uses a covariance summary to detect concept drift in an unsupervised setting. The algorithm proposed in this thesis monitors the underlying characteristics of the input data, maintains data summaries of snapshots taken at various points in time, and uses effective distance metrics to determine when the concept drifts. The technique was evaluated on synthetic and real data sets.
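The general idea described in the abstract (summarize snapshots of unlabeled data, then compare the summaries with a distance metric) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names `summarize`, `summary_distance`, and `detect_drift`, the fixed-size windows, the particular distance (mean shift plus Frobenius norm of the covariance difference), and the threshold are all assumptions chosen for illustration.

```python
import numpy as np

def summarize(window):
    """Return the (mean, covariance) summary of one snapshot (rows are samples)."""
    window = np.asarray(window, dtype=float)
    return window.mean(axis=0), np.cov(window, rowvar=False)

def summary_distance(a, b):
    """Illustrative distance: mean shift plus Frobenius norm of the covariance change."""
    mean_a, cov_a = a
    mean_b, cov_b = b
    mean_shift = np.linalg.norm(mean_a - mean_b)
    cov_shift = np.linalg.norm(cov_a - cov_b, ord="fro")
    return float(mean_shift + cov_shift)

def detect_drift(stream, window_size, threshold):
    """Return start indices of windows whose summary differs from the
    reference summary by more than `threshold` (hypothetical rule)."""
    stream = np.asarray(stream, dtype=float)
    reference = summarize(stream[:window_size])
    drift_points = []
    for start in range(window_size, len(stream) - window_size + 1, window_size):
        current = summarize(stream[start:start + window_size])
        if summary_distance(reference, current) > threshold:
            drift_points.append(start)
            reference = current  # re-anchor on the new concept
    return drift_points
```

Because the summary includes the full covariance matrix, a change in the correlation between attributes can be flagged even when each attribute's individual mean and variance stay the same, and no labels are needed. The threshold would have to be tuned for the data and window size at hand.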
Library of Congress Subject Headings
Data mining; Concepts; Machine learning
Publication Date
2006
Document Type
Thesis
Student Type
Graduate
Degree Name
Computer Science (MS)
Department, Program, or Center
Computer Science (GCCIS)
Advisor
Ankur Teredesai
Advisor/Committee Member
Roger Gaborski
Advisor/Committee Member
Hans-Peter Bischop
Recommended Citation
Chakravorty, Mamidi Sree Kalyan, "Gaussian Mixture Approach to Detect Drift" (2006). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/8057
Campus
RIT – Main Campus
Comments
Physical copy available from RIT's Wallace Library at QA76.9.D343 C33 2006