Making transcribed manuscript data available for research

Automatically transcribed manuscripts are not 100 percent accurate. In a project funded by the Volkswagen Foundation, Professor Malte Rehbein, Chair of Computational Humanities, and Professor Alexander Werth, Chair of German Linguistics, are investigating the extent to which good science is possible even with inaccurate data.

The project ‘Methodology of the Inaccurate’ addresses a central question in the philosophy and methodology of science: to what extent is good science possible even with faulty data? In the project, automatically transcribed historical manuscripts (council minutes from the 17th to 19th centuries) with an accuracy of approximately 90 percent are analysed with respect to various linguistic and historical research questions. The results are then compared with those obtained from manually transcribed versions of the same sources, which have an accuracy of 100 percent.
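
The article does not say how the accuracy value is determined. One common way to measure transcription accuracy is to count the character-level edits needed to turn the automatic transcription into the manual one and to relate that count to the length of the manual text. The following Python sketch assumes that measure; the function names and the sample strings are invented for illustration and are not taken from the project.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed accuracy measure: 1 minus the character error rate,
    where the manual transcription serves as the reference."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    error_rate = levenshtein(reference, hypothesis) / len(reference)
    return max(0.0, 1.0 - error_rate)


# Invented sample: one wrong character in a ten-character word.
manual = "manuscript"
automatic = "manuscrlpt"
accuracy = character_accuracy(manual, automatic)
print(f"{accuracy:.0%}")
```

Under this assumed measure, an accuracy of about 90 percent means that roughly one character in ten differs from the manual transcription.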

‘A key objective of the project is to make automatically transcribed manuscript data usable for research,’ explains Professor Alexander Werth. ‘In historical graphemics in particular, i.e. the study of how letters and punctuation are used in writing, we in linguistics sometimes rely on large data sets that are very difficult and time-consuming to obtain through manual transcription.’

For Professor Malte Rehbein, what makes the project particularly special is that ‘the Volkswagen Foundation is giving us the opportunity to test an innovative idea on large amounts of data. There is a risk that we will fail, but there is also a great opportunity to discover something truly new.’

Thematically, the project is also linked to the International Centre for Scholarship and Science ‘Methodikum’, founded by the Passau chairs of Multilingual Computational Linguistics (Professor Johann-Mattis List), Computational Humanities (Professor Malte Rehbein) and German Linguistics (Professor Alexander Werth). The centre aims to conduct basic methodological research in the humanities and serves as a point of contact for all questions relating to computer-assisted and digital methodology.

The Volkswagen Foundation is funding the project from 2025 to 2027 as part of its funding initiative ‘Aufbruch – New Research Spaces for the Humanities and Cultural Studies’. With this programme, the Volkswagen Foundation supports projects with a pioneering character that not only offer new perspectives on already known research topics, but also explore entirely new areas and topics of research.

The image shows a handwritten transcript from 10 January 1780 by the (Oberennsian) Spiritual Council of the Diocese of Passau. Photo: Archive of the Diocese of Passau

This text was machine-translated from German.

Principal Investigator(s) at the University: Prof. Dr. Alexander Werth (Chair of German Linguistics), Prof. Dr. Malte Rehbein (Chair of Computational Humanities)
Project period: 01.10.2025 - 31.03.2027
Source of funding: VolkswagenStiftung > Volkswagenstiftung - Aufbruch