Diversity-Aware NLP Intelligent Systems

Independent Research Group

Reflecting Intelligent Systems for Diversity, Demography, and Democracy (IRIS3D)

Diversity-Aware NLP Intelligent Systems (DANIS)

Project focus

Natural Language Processing (NLP) is a branch of artificial intelligence that enables computers to understand, interpret, and generate human language. An NLP Intelligent System is a tool, such as a search engine, voice assistant, or chatbot, designed to assist people by processing and responding to their requests in a natural and seemingly intelligent way. 

The Diversity-Aware NLP Intelligent Systems group (DANIS) focuses on making these tools inclusive and fair. Our main motivation is the recognition that people communicate in diverse ways, shaped by their life experiences, communication contexts, and personal preferences. For instance, individuals may express their gender identity through language, construct arguments in ways that resonate with them, or speak in specific dialects. Because standard NLP systems are often not equipped to recognize or accommodate these linguistic variations, they can discriminate against people from underrepresented groups or those who use non-standard linguistic expressions. DANIS explores how to model such linguistic phenomena computationally and how to design NLP intelligent systems that treat all users equally, regardless of how they communicate.

Duration

January 2023 - December 2026

Cooperation

SRF IRIS

Funding

The project is funded by the Ministry of Science, Research and the Arts of the State of Baden-Württemberg.

Publications

  1. 2024

    1. Knuples, Urban, Agnieszka Falenska and Filip Miletić. 2024. Gender Identity in Pretrained Language Models: An Inclusive Approach to Data Creation and Probing. In: Findings of the Association for Computational Linguistics: EMNLP 2024, edited by Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen, 11612--11631. Miami, Florida, USA: Association for Computational Linguistics, November. https://aclanthology.org/2024.findings-emnlp.680.
    2. Kaiser, Jens and Agnieszka Falenska. 2024. How to Translate SQuAD to German? A Comparative Study of Answer Span Retrieval Methods for Question Answering Dataset Creation. In: Proceedings of the 20th Conference on Natural Language Processing (KONVENS 2024), edited by Pedro Henrique Luz de Araujo, Andreas Baumann, Dagmar Gromann, Brigitte Krenn, Benjamin Roth, and Michael Wiegand, 134--140. Vienna, Austria: Association for Computational Linguistics, September. https://aclanthology.org/2024.konvens-main.15.
    3. Go, Paul and Agnieszka Falenska. 2024. Is there Gender Bias in Dependency Parsing? Revisiting “Women’s Syntactic Resilience”. In: Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), edited by Agnieszka Faleńska, Christine Basta, Marta Costa jussà, Seraphina Goldfarb-Tarrant, and Debora Nozza, 269--279. Bangkok, Thailand: Association for Computational Linguistics, August. https://aclanthology.org/2024.gebnlp-1.17.
    4. Costa jussà, Marta, Pierre Andrews, Christine Basta, Juan Ciro, Agnieszka Falenska, Seraphina Goldfarb-Tarrant, Rafael Mosquera, Debora Nozza and Eduardo Sánchez. 2024. Overview of the Shared Task on Machine Translation Gender Bias Evaluation with Multilingual Holistic Bias. In: Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), edited by Agnieszka Faleńska, Christine Basta, Marta Costa jussà, Seraphina Goldfarb-Tarrant, and Debora Nozza, 399--404. Bangkok, Thailand: Association for Computational Linguistics, August. https://aclanthology.org/2024.gebnlp-1.26.
    5. Dönmez, Esra, Thang Vu and Agnieszka Falenska. 2024. Please note that I’m just an AI: Analysis of Behavior Patterns of LLMs in (Non-)offensive Speech Identification. In: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, edited by Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen, 18340--18357. Miami, Florida, USA: Association for Computational Linguistics, November. https://aclanthology.org/2024.emnlp-main.1019.
    6. Erhard, Lukas, Sara Hanke, Uwe Remer, Agnieszka Falenska and Raphael Heiko Heiberger. 2024. PopBERT. Detecting Populism and Its Host Ideologies in the German Bundestag. Political Analysis: 1--17. doi:10.1017/pan.2024.12, https://www.cambridge.org/core/article/popbert-detecting-populism-and-its-host-ideologies-in-the-german-bundestag/06C14C50B50D5A7AB45C4A7C8A5AD945.
    7. Faleńska, Agnieszka, Christine Basta, Marta Costa jussà, Seraphina Goldfarb-Tarrant and Debora Nozza, eds. 2024. Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP). Bangkok, Thailand: Association for Computational Linguistics. https://aclanthology.org/2024.gebnlp-1.0.
    8. Falenska, Agnieszka, Eva Maria Vecchi and Gabriella Lapesa. 2024. Self-reported Demographics and Discourse Dynamics in a Persuasive Online Forum. In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), edited by Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, and Nianwen Xue, 14606--14621. Torino, Italia: ELRA and ICCL, May. https://aclanthology.org/2024.lrec-main.1272.
    9. Chen, Hongyu, Michael Roth and Agnieszka Falenska. 2024. What Can Go Wrong in Authorship Profiling: Cross-Domain Analysis of Gender and Age Prediction. In: Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP), edited by Agnieszka Faleńska, Christine Basta, Marta Costa jussà, Seraphina Goldfarb-Tarrant, and Debora Nozza, 150--166. Bangkok, Thailand: Association for Computational Linguistics, August. https://aclanthology.org/2024.gebnlp-1.9.
  2. 2023

    1. Fanton, Nicola, Agnieszka Falenska and Michael Roth. 2023. How-to Guides for Specific Audiences: A Corpus and Initial Findings. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), 321--333. Toronto, Canada: Association for Computational Linguistics, July. doi:10.18653/v1/2023.acl-srw.46, https://aclanthology.org/2023.acl-srw.46.
