Professors Annette Hautli-Janisz and Steffen Herbold are presenting a graph from their new study.
An interdisciplinary team led by Steffen Herbold, Professor of AI Engineering, and Annette Hautli-Janisz, Professor of Computational Rhetoric and Natural Language Processing at the University of Passau, is investigating who provides the more authentic and relevant answers to society's political questions: humans or machines. The study is based on the British political talk show ‘Question Time’ on BBC1, one of the most-watched political talk shows in the UK, in which a panel of guests from society and politics responds to audience questions on current political topics.
"The starting point of our study was the observation that large language models have become better and better at imitating a predefined style," explains Professor Herbold. Generative AI can now simulate linguistic style and political partisanship and "is able to generate targeted political communication and thus influence public opinion," says the computer scientist.
The heart of the study: the authenticity of the responses
The Passau team wanted to know: How deceptively real is this communication? Can it convince people? To test this, the researchers analysed a dataset of written questions and answers from 30 episodes of Question Time between 2020 and 2022, extracting 119 questions with more than 500 answers. The language model ChatGPT 4 Turbo was tasked with answering those questions, with the requirement that it imitate the linguistic style of the original guest in the broadcast debate. "This aspect is at the heart of our study – it allows us to identify how authentic a generated response is perceived by the general public," explains co-author Professor Hautli-Janisz.
The research team conducted a representative study with a total of 948 British citizens. The participants were unaware that AI was being used. One set of respondents was presented with an actual answer, another set with an impersonated response. Other participants were asked to directly compare the AI-generated and actual answers or to attribute the response to a public figure.
The results at a glance:
Warning of possible consequences
"It is particularly problematic that answers that differ in content are classified as authentic. Because then we are faced with a situation where AI technology can be used for targeted misinformation about the speaker's point of view," explains Professor Hautli-Janisz. "Our study shows two things: large language models are capable of providing high-quality content for public debates. But there is an urgent need to educate the public about the potential harm this can have on society," says Professor Herbold. The unregulated use of AI technologies in political communication could have devastating consequences.
Public opinion
The researchers also asked participants about their attitudes towards the use of generative AI in public debates. The majority said they were familiar with generative AI technologies, such as large language models, and supported their use, but doubted that they could make a valuable contribution to public debates. When it came to regulation, opinions were mixed. The researchers also wanted to know whether participants' views changed once they learned that the answers in the experiment had been generated by a machine. For the majority, nothing changed. However, there was a clear trend towards transparency, with more than 85 per cent of participants calling for the use of AI technologies to be disclosed and for information to be provided on how these technologies are developed.
The study, entitled "LLM-impersonated debate contributions are more authentic, relevant and coherent than their original: A representative study using BBC1’s Question Time", has been published in the journal PLOS One, an international, multidisciplinary online journal published by the Public Library of Science (PLOS).
About the research team
Professor Steffen Herbold holds the Chair of AI Engineering at the University of Passau. In his research, he focuses on the quality of AI models. For the present study, he collaborated with Dr Alexander Trautsch to run the statistical analysis and set up a data collection platform. He developed the study design together with Professor Hautli-Janisz.
Annette Hautli-Janisz is Junior Professor of Computational Rhetoric and Natural Language Processing. Her research focuses on how the argumentative skills of AI-powered language models develop. Besides carrying out the computational linguistic analysis in the study, she also came up with the idea of using the existing dataset QT30 for the study. This is the largest dataset on political debate programmes and includes a total of 30 episodes of the British talk show "Question Time" (QT). Together with her doctoral student Zlata Kikteva, she analysed the actual and machine-generated answers from a linguistic perspective.