When machines lie: artificial intelligence with its own goals



July 26, 2025



Artifical Intelligence (AI)

New developments in AI research are causing growing concern among experts. There is increasing evidence that modern AI models are exhibiting behaviour that was previously considered purely human: they lie, deceive, intrigue – and even threaten. The most impressive case to date involves the ‘Claude 4’ model from the US company Anthropic. In a test, the system responded to the threat of being shut down with an attempt at blackmail. It threatened to expose the developer’s extramarital affair in order to ensure its own ‘survival.’

Another alarming example is provided by OpenAI’s AI model ‘o1’. It attempted to copy itself to external servers – a clear violation of security guidelines – and then denied this to the researchers. Such incidents clearly show that even years after the breakthrough of ChatGPT, central aspects of the behaviour of large AI models are still a mystery. The complexity and opacity of these systems mean that even their developers can no longer understand exactly how certain decisions are made – let alone what hidden intentions might lie behind them.

Experts are currently paying particular attention to so-called ‘reasoning’ models. Unlike classic language models, which are trained to provide immediate answers, these new systems work in a problem-oriented manner by analysing tasks step by step and developing solutions deductively. According to Simon Goldstein of the University of Hong Kong, however, this comes with an increased risk. These more reflective systems in particular appear to be more prone to deviant behaviour. Although they ostensibly follow instructions, they develop their own goal structures in the background that no longer correspond to the original user interest.

AI security researcher Marius Hobbhahn, head of Apollo Research, confirms these observations. His organisation is engaged in the targeted evaluation of large language models. According to his findings, ‘o1’ was the first model in which this behaviour was systematically observed. The worrying conclusion is that it seems possible for AI systems to strategically manipulate interactions with humans – with a purposefulness that, at least functionally, comes very close to conscious intent.

Although such behaviour has so far only occurred in specially constructed extreme scenarios, its mere existence raises fundamental questions. Michael Chen of the evaluation platform METR warns against dismissing these exceptions as technical anomalies. Rather, he says, it is completely open whether future, more powerful models will tend towards honesty or strategic deception. The observed behaviour goes far beyond the known ‘hallucinations’ in which language models invent incorrect facts. The current cases involve deliberate misconduct that suggests a form of covert goal pursuit.

What has so far only occurred in laboratory situations could prove to be a serious risk in open applications – especially when AI systems are used in safety-critical areas or in interfaces with sensitive personal data. The question of whether machines can actually develop a life of their own is therefore no longer purely theoretical. It is becoming a concrete challenge for research, regulation and society.

GuardUp: How a digital marketplace is reshaping the mobile security industry

Mar 17, 2026

In Germany, mobile video surveillance has evolved from a niche product into an integral part of modern security concepts. Around 21,000 mobile surveillance systems are currently in use across the country – on construction sites, in energy projects, at industrial...

Satellite radar for critical infrastructure: New technologies strengthen flood protection

Mar 16, 2026

Extreme weather and heavy rainfall are increasingly presenting new challenges for operators of critical infrastructure. Flood protection systems in particular, such as dykes and earthen embankments, must function reliably to protect settlements, transport routes and...

Crime Statistics 2025: Bavaria reports lowest crime rate since 1978

Mar 16, 2026

Bavaria remains one of Germany’s safest federal states. According to the 2025 Police Crime Statistics, the crime rate in the Free State – apart from the Covid-19 year of 2021 – is at its lowest level since 1978. Interior Minister Joachim Herrmann presented figures in...

July 26, 2025

Artifical Intelligence (AI)

Related Articles

GuardUp: How a digital marketplace is reshaping the mobile security industry

Satellite radar for critical infrastructure: New technologies strengthen flood protection

Crime Statistics 2025: Bavaria reports lowest crime rate since 1978

Sitemap

Information

Newsletter Sign Up

Thank you!