IRIS Colloquium | Analysis of behavior patterns of LLMs

January 22, 2025, 2:00 p.m. (CET)

Internal Colloquium Series
Esra Dönmez, IRIS3D Researcher, presents her research within the Diversity-Aware NLP Intelligent Systems (DANIS) Group.

Time: January 22, 2025, 2:00 p.m. – 3:00 p.m.
Event language: English
Meeting mode: in person
Venue: U32EGO.131
Universitätsstr. 32
Campus Vaihingen
Link: WebEx link to virtually attend the talk 

Offensive speech is highly prevalent on online platforms. Being trained on online data, Large Language Models (LLMs) display undesirable behaviors, such as generating harmful text or failing to recognize it. Despite the potential harms from LLMs in such applications, whether LLMs can reliably identify offensive speech and how they behave when they fail are open questions. In this work, we probed sixteen widely used LLMs and showed that most fail to identify (non-)offensive online language. Our experiments reveal undesirable behavior patterns in the context of offensive speech detection, such as erroneous response generation, over-reliance on profanity, and failure to recognize stereotypes.
