Statistical machine learning uses data to model the relationship between explanatory variables and a response variable. The adaptive boosting algorithm is a machine learning method for modeling classification data. It repeatedly retrains a weak base learner and combines the resulting classifiers to improve the accuracy of predicting the correct response class from a set of explanatory variables. Under a weak-learnability assumption, adaptive boosting drives the empirical error down exponentially fast in the number of boosting rounds, from which an empirical error bound can be derived for the algorithm. This empirical error bound motivates the question of whether a generalization error bound exists and what form it takes. Evidence from boosting several real datasets shows that the generalization error follows the same shape as the empirical error, suggesting that a shift of the empirical error bound can produce a generalization error bound. By simulating random datasets and varying their characteristics according to criteria that appear to affect the shift, we can boost them and derive a function by which to shift the empirical error bound. We record the test error of the boosted simulated datasets and build a regression model with that test error as the response and the varied dataset characteristics as the explanatory variables. The final regression model predicts the difference between the generalization error and the empirical error, enabling us to derive the suggested generalization error bound.
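A minimal sketch of the boosting procedure the abstract describes, assuming decision stumps as the weak base learner (the abstract does not fix a particular weak learner); the toy interval dataset at the end is purely illustrative:

```python
import numpy as np

def adaboost(X, y, T=20):
    """Minimal AdaBoost sketch: decision stumps as the weak learner, labels in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # start with uniform sample weights
    stumps, alphas = [], []
    for _ in range(T):
        best = None
        # weak learner: exhaustive search over (feature, threshold, polarity) stumps
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] < thr, 1, -1)
                    err = w[pred != y].sum()          # weighted empirical error
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # vote weight of this stump
        pred = pol * np.where(X[:, j] < thr, 1, -1)
        w *= np.exp(-alpha * y * pred)   # up-weight the misclassified points
        w /= w.sum()
        stumps.append((j, thr, pol))
        alphas.append(alpha)

    def predict(Xq):
        # final classifier: sign of the weighted vote of all stumps
        score = sum(a * p * np.where(Xq[:, j] < t, 1, -1)
                    for a, (j, t, p) in zip(alphas, stumps))
        return np.sign(score)
    return predict

# Toy 1-D dataset: +1 inside an interval, -1 outside (illustrative only)
X = np.array([[0.1], [0.2], [0.35], [0.5], [0.65], [0.8], [0.9]])
y = np.array([-1, -1, 1, 1, 1, -1, -1])
predict = adaboost(X, y, T=5)
train_acc = (predict(X) == y).mean()
```

On this toy sample the weighted vote reaches zero training error within a few rounds, illustrating the exponential decrease in empirical error that the bound discussed above formalizes.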

Library of Congress Subject Headings

Machine learning--Mathematical models

Publication Date


Document Type


Student Type


Degree Name

Applied Statistics (MS)

Department, Program, or Center

School of Mathematical Sciences (COS)


Advisor

Ernest Fokoué

Advisor/Committee Member

Steven LaLonde

Advisor/Committee Member

Mei Nagappan


Physical copy available from RIT's Wallace Library at Q325.5 .H68 2016


RIT – Main Campus