Analysis: Computers can recognise moods (Image: Gerd Altmann, pixabay.com)
New studies: computers often outperform humans in “theory of mind” tests
In so-called “theory of mind” tests, large AI language models (LLMs) such as OpenAI’s ChatGPT often perform better than humans. Theory of mind refers to the ability to attribute mental states to other people, in other words, to sense what another person is thinking or feeling. According to a team of psychologists and neurobiologists, two families of LLMs are capable of equalling or even surpassing humans in such tests.
Mental state signalled
LLMs have improved greatly in recent years, and their range of abilities has grown steadily. One emerging skill is the ability to infer a person’s mental state from their utterances. Psychologists have developed theory-of-mind tasks to measure how well someone can infer another person’s mental and emotional state during social interactions.
Previous research has shown that people use a variety of cues to signal their mental state to others, and that people vary in how accurately they can read the emotional state of their counterpart from these cues. Until now, many experts had doubted that computers could share this ability.
1,907 participants versus two LLMs
Neuroscientists from Italy, the USA, the UK and the University Medical Centre Hamburg-Eppendorf (https://www.uke.de/) challenge this assumption. They analysed the responses of 1,907 volunteers who took part in standard theory-of-mind tests and compared the results with those of several LLMs, including Llama 2-70b and GPT-4. Both groups answered five types of questions, each designed to assess abilities such as detecting a faux pas, recognising irony, or judging the truthfulness of a statement.
The researchers found that the LLMs often matched human performance and sometimes exceeded it. More specifically, GPT-4 performed best across the five main task types, while Llama 2 in some cases performed well below human level but, unlike GPT-4, did markedly better at recognising faux pas.