Theses

Two Centuries of Sexism in British Parliament: A Computational Analysis of Women’s Representation in the Hansard Corpus

Mandira Sawkar

Abstract

Detecting sexism and gender bias in text is a central problem in NLP, yet most existing work focuses on short, informal content such as social media posts. In contrast, much less is known about how sexism operates in long-form, formal discourse, where it is often expressed indirectly through rhetorical framing, appeals to tradition, or references to gender roles. Parliamentary debates offer a particularly rich setting for studying such language, but their scale, linguistic complexity, use of archaic English, and the presence of distributional shift across centuries make systematic analysis challenging and historically infeasible without automation. In this work, we analyse 6,531 speeches on women’s political rights from over 200 years of UK parliamentary debate (Hansard, 1803–2005), using large language models to classify both stance and types of sexism, validated against expert annotation. We also release a structured version of the Hansard Corpus optimized for computational social science, comprising 6.7 million speeches across 1.2 million debates, with 89% gender coverage for MPs in the House of Commons. Our analysis reveals that 88% of speeches opposing women’s representation contain sexist content, compared to only 11% of speeches in support. However, the two sides deploy fundamentally different forms of sexism: anti-suffrage rhetoric is overwhelmingly hostile and proscriptive, while pro-suffrage sexism is predominantly benevolent and paternalistic. Female MPs support women’s political rights at a rate of 95%, compared to 80% for male MPs, with this gap narrowing only after enfranchisement. Our findings provide large-scale, naturalistic evidence that hostile and benevolent sexism serve distinct rhetorical functions in political discourse.
By combining LLM-based classification with established social-psychological theory, we demonstrate how computational methods can uncover the structure of reasoning in historical text, enabling the study of bias not just in what is said, but in how arguments are made across centuries of institutional debate.

Publication Date

4-29-2026

Document Type

Thesis

Student Type

Graduate

Degree Name

Artificial Intelligence (MS)

Department, Program, or Center

Information and Computing Studies

College

Golisano College of Computing and Information Sciences

Advisor

Ashique KhudaBukhsh

Advisor/Committee Member

Christopher Homan

Advisor/Committee Member

Evan Selinger

Recommended Citation

Sawkar, Mandira, "Two Centuries of Sexism in British Parliament: A Computational Analysis of Women’s Representation in the Hansard Corpus" (2026). Thesis. Rochester Institute of Technology. Accessed from https://repository.rit.edu/theses/12556

Campus

RIT – Main Campus
