Computer science doctoral student from University of Passau wins first place at international database conference

At this year's "International Conference on Management of Data" (SIGMOD) in Seattle, Stefan Klessinger, a doctoral student in computer science at the University of Passau, won first place in the "Student Research Competition", a forum that gives doctoral students the opportunity to present their ongoing research. His research aims to recognise structures in data and thereby improve the quality of the data in a dataset.

By Nicola Jacobi

Stock image: Colourbox.

"In times of 'big data' and mass data processing, data quality is a major issue," says Klessinger. His research focuses on the automatic detection of dependencies in semi-structured data. These dependencies can be used to describe the structure of data (in what is called a "schema") more accurately than previous approaches. A more precise schema also makes work easier for data consumers (software developers, for example) by giving them a more accurate idea of what the data look like.

Doctoral student Stefan Klessinger. Photograph: Christian Haasz (werbeFOTO HAASZ)

Since October 2021, Klessinger has been working in both international and national teams at the Chair of Scalable Database Systems, held by Professor Stefanie Scherzinger, who herself researches semi-structured data. "On account of the diversity in these research groups, there are lots of exciting ideas," says Klessinger, who has been working on his current research topic for about a year. "This has given rise to various points of departure. The discussions in the teams, and also during international conference trips, are inspiring and motivating."

His research focus combines two areas that have so far mostly been researched independently of each other: automatic structure recognition in semi-structured data, on the one hand, and automatic detection of dependencies in structured data, on the other. He explains that a major difficulty in both areas is that the structure or the dependencies need to be described adequately, but not in too much detail. Automatically detected dependencies often hold only by coincidence for the data under consideration and can be invalidated once additional data are included. Likewise, the structure of different records from the same dataset may vary, which is often a strong incentive to draw up a meaningful abstraction of the identified structure. "If your description of the dependencies or the structure of processed data is too narrow, new data that are actually valid may be identified as faulty. However, if the description lacks precision, data that are actually faulty will not be recognised as such."

An example to illustrate this point

A dataset describes people using what are called "attributes", including first name, second name, surname, date of birth and generation. Current approaches focus on recognising that the "second name", for instance, is not always present, or that the "date of birth" is a number whereas the other attributes are character strings. Klessinger's research is about formulating more precise descriptions using what are called "dependencies". If the year of birth is given as 2000, for instance, such a dependency would make clear that the person belongs to "Generation Z".
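The difference between a purely structural description and one that also includes a dependency can be sketched in a few lines of Python. This is only an illustration of the idea, not Klessinger's method: the records, the attribute names (`first_name`, `birth_year` and so on), the two helper functions and the birth-year range used for "Generation Z" (1997 to 2012) are all assumptions made for this example.

```python
# Hypothetical records; attribute names loosely follow the example in the text.
people = [
    {"first_name": "Ada", "surname": "Meyer",
     "birth_year": 2000, "generation": "Generation Z"},
    {"first_name": "Ben", "second_name": "Karl", "surname": "Huber",
     "birth_year": 1985, "generation": "Millennial"},
    {"first_name": "Cem", "surname": "Kaya",
     "birth_year": 2003, "generation": "Millennial"},  # inconsistent on purpose
]

# Assumed boundaries for "Generation Z"; definitions vary between sources.
GEN_Z_YEARS = range(1997, 2013)


def has_valid_structure(record: dict) -> bool:
    """Structure-only check: required/optional attributes and their types."""
    required = {"first_name": str, "surname": str,
                "birth_year": int, "generation": str}
    optional = {"second_name": str}
    for name, expected_type in required.items():
        if not isinstance(record.get(name), expected_type):
            return False
    for name, expected_type in optional.items():
        if name in record and not isinstance(record[name], expected_type):
            return False
    return True


def satisfies_dependency(record: dict) -> bool:
    """Dependency: a birth year in the assumed range implies 'Generation Z'."""
    if record["birth_year"] in GEN_Z_YEARS:
        return record["generation"] == "Generation Z"
    return True


# Records that look fine structurally but violate the dependency.
flagged = [r["first_name"] for r in people
           if has_valid_structure(r) and not satisfies_dependency(r)]
print(flagged)
```

All three records pass the structural check, so a description based on attribute types alone would accept them. Only the additional dependency exposes the third record, whose birth year and generation contradict each other.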

This research earned Klessinger first place in the "Student Research Competition" at this year's "International Conference on Management of Data" (SIGMOD), which took place in Seattle in June and is regarded as one of the most important international conferences on databases. Chair holder Professor Stefanie Scherzinger expressed her delight: "It's the second time in a row that a staff member of the Chair has made it to the final round of the ACM SIGMOD Student Research Contest. I am really very pleased that Mr Klessinger won the competition this year."

About Stefan Klessinger

Stefan Klessinger has been studying at the University of Passau since 2013. After earning his bachelor's degree in internet computing in 2019, he completed his master's degree in computer science. He has been working as a research assistant at the Chair of Scalable Database Systems, held by Professor Stefanie Scherzinger, since October 2021.
