Automated interpretation of human emotion has become increasingly important as human-computer interaction becomes ubiquitous. Affective computing is a field of computer science concerned with recognizing, analyzing, and interpreting human emotions in a range of media, including audio, video, and text. Social media, in particular, are rich in expressions of people's moods, opinions, and sentiments. This thesis focuses on predicting the intensity of emotion expressed on the social network Twitter. We use lexical features together with sentiment and emotion lexicons to extract features from tweets, messages of 280 characters or fewer shared on Twitter. We also use a form of transfer learning: word and sentence embeddings extracted from neural networks trained on large corpora. Estimating emotional intensity is a regression task, for which we use linear and tree-based models. We compare the results of these individual models and build a final ensemble model that predicts the emotional intensity of tweets by combining the outputs of the individual models. We also use lexical features and word embeddings to train a recently introduced model designed to handle data with sparse or rare features; this model combines LASSO regularization with grouped features. Finally, we conduct an error analysis and highlight areas that need improvement.
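
As a rough illustration of two steps named in the abstract, the sketch below derives simple lexicon-based features from a tweet and combines several models' intensity predictions into one ensemble score. It is a minimal sketch under stated assumptions, not the thesis's implementation: the tiny `ANGER_LEXICON`, the feature names, and the weighted-average combination rule are all hypothetical, since the abstract does not specify which lexicons were used or how the ensemble combines its inputs.

```python
import re

# Hypothetical toy lexicon mapping words to an emotion-association score.
# Real emotion lexicons are far larger; these entries are illustrative only.
ANGER_LEXICON = {"furious": 0.9, "angry": 0.8, "annoyed": 0.5}


def lexicon_features(tweet):
    """Return simple lexicon-based features for one tweet (assumed feature set)."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    scores = [ANGER_LEXICON[t] for t in tokens if t in ANGER_LEXICON]
    return {
        "count": len(scores),             # number of lexicon words found
        "total": sum(scores),             # summed association scores
        "max": max(scores, default=0.0),  # strongest single association
    }


def ensemble_predict(predictions, weights=None):
    """Combine per-model intensity predictions by weighted average.

    Averaging is an assumption made for this sketch; other combination
    rules, such as a second-stage regressor, are equally plausible.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    return sum(w * p for w, p in zip(weights, predictions)) / sum(weights)


features = lexicon_features("I am SO angry and annoyed!")
combined = ensemble_predict([0.2, 0.4, 0.6])
```

In a full pipeline, feature dictionaries like `features` would be stacked into a design matrix alongside the embeddings, and each linear or tree-based regressor would supply one entry of the list passed to `ensemble_predict`.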

Library of Congress Subject Headings

Semantic computing; Twitter--Data processing; Text processing (Computer science); Emotions--Data processing; Human-computer interaction; Neural networks (Computer science)

Degree Name

Applied Statistics (MS)

Department, Program, or Center

School of Mathematical Sciences (COS)


Advisor

Ernest Fokoué

Advisor/Committee Member

Robert Parody

Advisor/Committee Member

Linlin Chen


Campus

RIT – Main Campus
