Abstract
Automated app review analysis is an important avenue for extracting a variety of requirements-related information. Typically, a first step toward performing such analysis is preparing a training dataset, where developers (experts) identify a set of reviews and manually annotate them according to a given task. Having sufficiently large training data is important both for achieving high prediction accuracy and for avoiding overfitting. Given millions of reviews, preparing a training set is laborious. We propose incorporating active learning, a machine learning paradigm, to reduce the human effort involved in app review analysis. Our app review classification framework exploits three active learning strategies based on uncertainty sampling. We apply these strategies to an existing dataset of 4,400 app reviews to classify reviews as features, bugs, rating, or user experience. We find that active learning, compared to a randomly chosen training dataset, yields significantly higher prediction accuracy under multiple scenarios.
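The abstract does not name the three uncertainty sampling strategies. As a minimal sketch, assuming they resemble the three measures most commonly used in the active learning literature (least confidence, margin, and entropy), the query-selection step might look like the following. The function names and the probability values are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
import math

def least_confidence(probs):
    """Uncertainty as 1 minus the top class probability (higher = more uncertain)."""
    return 1.0 - max(probs)

def margin_score(probs):
    """Negated gap between the two most probable classes (higher = more uncertain)."""
    top, second = sorted(probs, reverse=True)[:2]
    return -(top - second)

def entropy_score(probs):
    """Shannon entropy of the predicted distribution (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(pool_probs, score_fn, batch_size):
    """Return indices of the batch_size most uncertain unlabeled items."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score_fn(pool_probs[i]),
        reverse=True,
    )
    return ranked[:batch_size]

# Hypothetical classifier outputs for four unlabeled reviews over four classes
# (features, bugs, rating, user experience). Values are made up for illustration.
pool_probs = [
    [0.97, 0.01, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.15, 0.10],  # two classes compete
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
    [0.70, 0.20, 0.05, 0.05],  # fairly confident
]

# Reviews picked for manual annotation in the next round.
to_annotate = select_queries(pool_probs, entropy_score, batch_size=2)
```

In a full loop, the selected reviews would be labeled by an annotator, added to the training set, and the classifier retrained before the next selection round. Random selection, the baseline mentioned in the abstract, would replace `select_queries` with a uniform draw from the pool.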
Publication Date
9-2019
Document Type
Thesis
Student Type
Graduate
Degree Name
Software Engineering (MS)
Department, Program, or Center
Software Engineering (GCCIS)
Advisor
Pradeep Murukanah
Advisor/Committee Member
Mohamed Wiem Mkaouer
Advisor/Committee Member
J. Scott Hawker
Recommended Citation
Thimma Dhinakaran, Venkatesh, "App Review Analysis via Active Learning: Reducing Supervision Effort Without Compromising Classification Accuracy" (2019). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/10229
Campus
RIT – Main Campus