Abstract
Artificial intelligence (AI) systems frequently need to learn from human decision makers, which helps them scale to mass use. We see this in automated financial decisions, manufacturing processes, self-driving cars, and content moderation on social networks. These systems are built on human values and trained to mimic human decision-making through human annotation. However, different groups of human annotators, including domain experts, annotate content differently, leading to disagreements. Some practitioners treat dissenting opinions as "label noise," and a common way to resolve disagreement during model training is to take a majority vote, which effectively conceals the disagreements from the trained model and masks diverse perspectives. Even with balanced datasets, imbalanced perspectives within the data can introduce inherent biases into the model. Such biases are hard to identify because they surface only in specific instances, such as an AI model treating minority demographic groups unfairly. In this dissertation, we introduce two models for learning and predicting human disagreements: CrowdOpinion and DisCo. CrowdOpinion is a semi-supervised learning approach that models disagreement across the entire annotator population by pooling similar data items. DisCo is an encoder-decoder-based model designed to capture individual annotators' characteristics and their disagreements during data annotation. Finally, to further emphasize the need for human involvement in building AI models for content moderation, we conduct a noise audit of state-of-the-art offensive and hate speech classification models. As part of this audit, we conduct a human annotation study focusing on annotators' ability to identify content that is offensive to themselves and their capacity to identify content that is offensive to others through vicarious labeling. Our findings provide compelling evidence that modeling human disagreement is crucial for AI systems to effectively classify offensive and harmful content. We conclude by summarizing the scope of our research and outlining promising avenues for future exploration in this domain.
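To make concrete how majority-vote aggregation discards disagreement, the following is a minimal, illustrative Python sketch, not code from the dissertation: it contrasts a single hard majority label with the empirical label distribution over the same annotations. The function names and the toy annotations are hypothetical.

from collections import Counter

def majority_label(annotations):
    # Collapse per-item annotations into one hard label (ties broken arbitrarily).
    return Counter(annotations).most_common(1)[0][0]

def label_distribution(annotations):
    # Keep the full empirical distribution over labels, preserving disagreement.
    counts = Counter(annotations)
    total = len(annotations)
    return {label: count / total for label, count in counts.items()}

# Toy example: five annotators label one social media post.
annotations = ["offensive", "not_offensive", "offensive", "not_offensive", "not_offensive"]

print(majority_label(annotations))      # "not_offensive" -- the 2/5 dissenting view disappears
print(label_distribution(annotations))  # {"offensive": 0.4, "not_offensive": 0.6} -- disagreement retained

Training on the retained distribution (for example, with a soft-label loss) rather than the collapsed majority label is one general way a model can learn from disagreement instead of discarding it.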
Publication Date
4-2024
Document Type
Dissertation
Student Type
Graduate
Degree Name
Computing and Information Sciences (Ph.D.)
Department, Program, or Center
Computing and Information Sciences Ph.D, Department of
College
Golisano College of Computing and Information Sciences
Advisor
Christopher M. Homan
Advisor/Committee Member
Alexander G. Ororbia II
Advisor/Committee Member
Ashique KhudaBukhsh
Recommended Citation
Weerasooriya, Tharindu Cyril, "Learning from Disagreement in Human-Annotated Datasets" (2024). Thesis. Rochester Institute of Technology. Accessed from
https://repository.rit.edu/theses/11899
Campus
RIT – Main Campus