Abstract
Feature selection (FS) is the process of finding an ideal set of features for a prediction model from a set of candidate features. A key step in designing a prediction model is reducing the size of the input feature set while increasing its usefulness. Doing so reduces the complexity of the model, so the model runs more quickly and the usefulness of each individual feature is easier to explain. However, FS can be time-consuming and can yield mixed results. FS is often partially automated with algorithms. The quality of these algorithms varies, and many require long run times yet still produce mixed results. Few FS algorithms offer an intuitive way to explore a feature space; most require the user to supply a finite list of candidate features before the algorithm can start. To address these shortcomings, Kaizen Programming with Enhanced Feature Discovery (KP-EFD) was developed. KP-EFD is an evolutionary tool that combines a Genetic Programming (GP) framework with concepts of Continuous Improvement from Kaizen, a Japanese methodology, to intuitively expand and search a feature space for an ideal feature set. KP-EFD was tested with continuous and binary variables, for both interpolation and extrapolation. The method performed well for some datasets and model types and fell short of acceptable performance for others. With additional improvements, however, KP-EFD has the potential to become very versatile, saving time and frustration when working with any type of data and prediction algorithm.
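To make the idea of expanding and searching a feature space concrete, the following is a minimal illustrative sketch in Python. It is not the KP-EFD implementation: the abstract does not specify the algorithm's operators, fitness measure, or acceptance rule, so everything here is an assumption chosen for brevity. The sketch builds candidate features by randomly combining existing ones, fits each candidate to the current residual with one-variable least squares, and keeps a candidate only if it reduces the squared error (a simple improve-or-discard rule in the spirit of continuous improvement). The names `OPS`, `fit_on_residual`, and `discover_features` are hypothetical.

```python
import random

# Hypothetical operator set for building new features from existing ones.
# A real GP system would typically use a richer set and expression trees.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def sse(values):
    """Sum of squared values (applied to residuals below)."""
    return sum(v * v for v in values)


def fit_on_residual(column, residual):
    """One-variable least squares without intercept: residual ~ coef * column."""
    denom = sum(x * x for x in column)
    if denom == 0.0:
        return 0.0  # an all-zero feature cannot explain anything
    return sum(x * r for x, r in zip(column, residual)) / denom


def discover_features(base, target, rounds=200, min_gain=1e-9, seed=0):
    """Greedy improve-or-discard search over randomly constructed features.

    base:   dict mapping feature name -> list of floats (not modified)
    target: list of floats
    Returns (selected, final_sse); selected is a list of (name, coef) pairs.
    """
    rng = random.Random(seed)
    pool = dict(base)        # feature space; grows as features are accepted
    residual = list(target)  # part of the target not yet explained
    selected = []
    for _ in range(rounds):
        # Propose a candidate: an existing feature or a new combination.
        names = sorted(pool)
        a = rng.choice(names)
        if rng.random() < 0.5:
            name, column = a, pool[a]
        else:
            b = rng.choice(names)
            op = rng.choice(sorted(OPS))
            name = f"{op}({a},{b})"
            column = [OPS[op](x, y) for x, y in zip(pool[a], pool[b])]
        coef = fit_on_residual(column, residual)
        new_residual = [r - coef * x for r, x in zip(residual, column)]
        # Assumed acceptance rule: keep the candidate only if it
        # measurably reduces the squared error of the current model.
        if sse(residual) - sse(new_residual) > min_gain:
            pool[name] = column
            selected.append((name, coef))
            residual = new_residual
    return selected, sse(residual)
```

Calling `discover_features({"x1": [...], "x2": [...]}, target)` returns the accepted features with their coefficients and the remaining squared error. Because accepted combinations are added back to the pool, later rounds can build on them, which is one simple way a search can start from a short list of base features instead of a fixed, fully enumerated candidate list.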
Library of Congress Subject Headings
Database management; Data mining; Genetic algorithms; Business planning--Data processing; Industrial management--Data processing; Time-series analysis
Publication Date
4-23-2020
Document Type
Thesis
Student Type
Graduate
Degree Name
Industrial and Systems Engineering (MS)
Department, Program, or Center
Industrial and Systems Engineering (KGCOE)
Advisor
Katie McConky
Advisor/Committee Member
Nasibeh Azadeh Fard
Recommended Citation
Stelmack, John, "Kaizen Programming with Enhanced Feature Discovery: An Automated Approach to Feature Selection and Feature Discovery for Prediction Models" (2020). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/10419
Campus
RIT – Main Campus
Plan Codes
ISEE-MS