Abstract

Artificial intelligence (AI) systems frequently need to learn from human decision makers, which helps scale those systems for mass use. We see this in automated financial decisions, manufacturing processes, self-driving cars, and content moderation on social networks. These systems are built on human values and trained to mimic human decision-making through human annotation. However, different groups of human annotators, including domain experts, annotate content differently, leading to disagreements. Some practitioners treat dissenting opinions as “label noise”. A common approach to resolving such disagreement during model training is to take a majority vote, which effectively conceals dissenting perspectives from the trained model. Even with balanced datasets, imbalanced perspectives within the data can introduce biases into the model. Such biases can be hard to identify because they surface only in specific instances, such as a model treating minority demographic groups unfairly. In this dissertation, we introduce two models that learn and predict human disagreements: CrowdOpinion and DisCo. CrowdOpinion is a semi-supervised learning approach that models disagreement across the entire annotator population by pooling similar data items. DisCo is an encoder-decoder-based model designed to capture individual annotators’ characteristics and their disagreements during data annotation. Finally, to further emphasize the need for human involvement in building AI models for content moderation, we conduct a noise audit of state-of-the-art offensive and hate speech classification models, underscoring the importance of involving humans throughout the annotation process. As part of this audit, we conduct a human annotation study focusing on annotators’ ability to identify content that is offensive to themselves and their capacity to identify content that is offensive to others through vicarious labeling. Our findings provide compelling evidence that modeling human disagreements is crucial for AI systems to effectively classify offensive and harmful content. We conclude by summarizing the scope of our research and outlining promising avenues for future exploration in this domain.
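To illustrate the contrast the abstract draws, the following minimal Python sketch (not taken from the dissertation; the function names and toy example are illustrative assumptions) compares majority-vote aggregation, which discards dissenting labels, with retaining the full annotator label distribution as a soft target.

    from collections import Counter

    def majority_vote(labels):
        # Collapse per-annotator labels into one hard label (ties broken arbitrarily).
        return Counter(labels).most_common(1)[0][0]

    def label_distribution(labels, label_set):
        # Keep the empirical distribution of annotator judgments as a soft target.
        counts = Counter(labels)
        total = len(labels)
        return {lab: counts.get(lab, 0) / total for lab in label_set}

    # Example: five annotators judge one social-media post.
    annotations = ["offensive", "not_offensive", "offensive", "not_offensive", "not_offensive"]

    print(majority_vote(annotations))
    # 'not_offensive' -- the two dissenting judgments are discarded
    print(label_distribution(annotations, ["offensive", "not_offensive"]))
    # {'offensive': 0.4, 'not_offensive': 0.6} -- disagreement is preserved

Training on the preserved distribution (for example, as soft labels) is one way a model can be exposed to annotator disagreement rather than a single majority label.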

Publication Date

4-2024

Document Type

Dissertation

Student Type

Graduate

Degree Name

Computing and Information Sciences (Ph.D.)

Department, Program, or Center

Department of Computing and Information Sciences

College

Golisano College of Computing and Information Sciences

Advisor

Christopher M. Homan

Advisor/Committee Member

Alexander G. Ororbia II

Advisor/Committee Member

Ashique KhudaBukhsh

Campus

RIT – Main Campus
