What Can Go Wrong in Authorship Profiling for Demographic Prediction: A Systematic Error Analysis of Model Exclusion

September 1, 2023 / IRIS3D

Hongyu Chen

Hongyu Chen will present her poster at the 3rd Workshop on Computational Linguistics for the Political and Social Sciences.

Authorship Profiling (AP) has become a prominent task in recent years, aiming to identify an author's demographic characteristics through their writing style. While text categorization using stylometric features has shown high accuracy in predicting demographic attributes (e.g., gender and age) in certain tasks, there are still cases where ML models and state-of-the-art models do not perform well. This carries the risk of marginalizing and misrepresenting certain demographic groups, ultimately giving rise to bias and discrimination. Thus, to gain a comprehensive understanding of the extent to which sub-optimal models might exclude authors in certain demographic groups, this paper aims to shed light on the models' exclusion behavior in the AP task through a systematic error analysis.

LINK TO THE PRESENTATION

Contact

Hongyu Chen
