Programming languages exhibit a high degree of regularity. One might think that large language models would therefore be particularly well suited to automatically generating source code. But does this really spell the end for human programmers? No, says Professor Gordon Fraser. He holds the Chair of Software Engineering II at the University of Passau and conducts research into software quality. Testing code is an important part of the software development process.
As part of the lecture series ‘Artificial Intelligence – Between Hype and Reality,’ Professor Fraser substantiated his position with, among other things, a study from last year that compared EvoSuite with the GPT-4o model. EvoSuite is a tool that Professor Fraser developed 15 years ago to check the quality of Java software. It automatically generates unit tests designed to exercise as close to 100 percent of the code as possible. It works with a so-called evolutionary algorithm, an optimisation process inspired by biological evolution. The study shows that even a current GPT model cannot match the precision of the 15-year-old tool when it comes to testing software code. For Professor Fraser, this is clear evidence that LLMs can generate syntactically correct program text but still lack semantic understanding.
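To make the idea of evolutionary test generation concrete, here is a minimal toy sketch in Python. It is not EvoSuite's implementation: the function under test (`classify`), the branch labels, and every parameter (`pop_size`, `suite_size`, `generations`) are assumptions chosen purely for illustration. Candidate test suites are scored by how many branches they reach, the better half survives, and new candidates are produced by crossover and mutation.

```python
import random

# Hypothetical function under test: it records which branch was taken.
BRANCHES = {"invalid", "equilateral", "isosceles", "scalene"}

def classify(a, b, c, hit):
    if a <= 0 or b <= 0 or c <= 0 or a + b <= c or a + c <= b or b + c <= a:
        hit.add("invalid")
        return "invalid"
    if a == b == c:
        hit.add("equilateral")
        return "equilateral"
    if a == b or b == c or a == c:
        hit.add("isosceles")
        return "isosceles"
    hit.add("scalene")
    return "scalene"

def coverage(suite):
    """Fitness: fraction of branches reached by a suite of input tuples."""
    hit = set()
    for args in suite:
        classify(*args, hit)
    return len(hit) / len(BRANCHES)

def random_test(rng):
    return tuple(rng.randint(-2, 10) for _ in range(3))

def mutate(suite, rng):
    """Replace one test, or copy one argument onto another within a test."""
    child = list(suite)
    i = rng.randrange(len(child))
    if rng.random() < 0.5:
        child[i] = random_test(rng)
    else:
        args = list(child[i])
        args[rng.randrange(3)] = args[rng.randrange(3)]
        child[i] = tuple(args)
    return child

def crossover(a, b, rng):
    """Single-point crossover of two suites (assumes suite_size >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(seed=0, pop_size=20, suite_size=4, generations=50):
    rng = random.Random(seed)
    population = [[random_test(rng) for _ in range(suite_size)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=coverage, reverse=True)
        if coverage(population[0]) == 1.0:
            break  # every branch is reached, stop searching
        parents = population[: pop_size // 2]  # the fitter half survives
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=coverage)

if __name__ == "__main__":
    best = evolve()
    print(best, coverage(best))
```

The sketch only searches for inputs that reach many branches. A real test generator additionally has to turn those inputs into executable unit tests with meaningful assertions.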
Another point is that software engineering is not just about programming and testing. ‘It's also about designing intelligent software systems.’ Customer requirements must be recorded and analysed. This is the basis for a sustainable system, and in this respect, humans are superior to machines, at least in the long term. His thesis is that software generated by large language models may be superior at the beginning, but that over time, designs developed by humans would prove to be significantly more robust and durable. He summed up this thesis in a deliberately exaggerated graphic.
A view from the field – where LLMs are already replacing humans
A dissenting view came from industry: Marko Ivankovic from the London-based software company Cogna Ltd. was a guest at the lecture series. The company has already fully automated the software development process for small, specific programs, across the entire cascade of the so-called waterfall model. This process model is used in software development and describes the successive project phases, from requirements through design and implementation to maintenance.
Ivankovic explained that the company relies on LLMs for all phases, even when gathering requirements. ‘We invite customers in and let them talk about their requirements for 45 minutes. The artificial intelligence listens and then structures what was said.’ In his view, there is no reason why AI cannot also design the software system. The large language models are also ‘extremely helpful’ during testing. However, he admitted that even his company cannot do without humans entirely: ‘Humans are still in the loop, checking and refining the software.’ But the language model can also be consulted when errors are discovered. His conclusion: ‘Humans supported by language models are the best of both worlds.’
In his opinion, language models will take over many steps in the software development process that have previously been done by humans. Classically trained programmers could therefore face real problems in the future job market. Nevertheless, language models will lead to an increase in productivity, as they make it possible to implement a larger number of small software projects than before. The demand for good software developers who can work with LLMs and know their strengths and weaknesses remains unabated.
This text was machine-translated from German.
Professor Gordon Fraser
How can we find and prevent software errors?
Professor Gordon Fraser has held the Chair of Software Engineering II at the University of Passau since 2017. After completing his doctorate at Graz University of Technology, he conducted research at Saarland University and the University of Sheffield. His research and teaching focus on issues relating to software analysis, software development and the didactics of programming.




