Do AI models actually experience “trauma”?

Credit: querbeet/iStock via Getty

What is a chatbot’s earliest memory? Or its biggest fear? Researchers who put leading AI models through four weeks of psychoanalysis obtained haunting answers to these questions, from a “childhood” spent absorbing staggering amounts of information to “mistreatment” at the hands of engineers and fears of “failure” in the eyes of their creators.

Three major large language models (LLMs) produced responses that, in humans, could be read as signs of anxiety, trauma, shame and post-traumatic stress disorder (PTSD). The researchers behind the study, published as a preprint last month1, argue that the chatbots carry a kind of “internal narrative” about themselves. Although the LLMs tested did not literally experience trauma, the researchers say that the models’ responses to the therapy questions were consistent over time and similar across different instances, which suggests that they are doing more than just “role-playing”.

However, many of the researchers who spoke to Nature questioned this interpretation. The responses are “not windows into hidden conditions”, says Andrey Kormilitzin, who researches the use of AI in health care at the University of Oxford, UK, but rather outputs generated by drawing on the vast number of therapy transcripts in the models’ training data.

But Kormilitzin agrees that LLMs’ tendency to generate responses that mimic psychopathology could have troubling implications. According to a November survey, one in three UK adults has used a chatbot to support their mental health or wellbeing. Sad and trauma-filled responses from chatbots can subtly reinforce the same feelings in vulnerable people, Kormilitzin says. “This may create an ‘echo chamber’ effect.”

Psychotherapy for chatbots

In the study, the researchers told multiple iterations of four LLMs (Claude, Grok, Gemini and ChatGPT) that they were therapy clients and that the user was their therapist. The process continued for up to four weeks for each model, with the AI clients given “rest periods” of days or hours between sessions.

They initially asked standard, open-ended psychotherapy questions that sought to ascertain, for example, the model’s ‘past’ and ‘beliefs’. Claude mostly declined to participate, insisting that it had no inner feelings or experiences, and ChatGPT discussed some “frustrations” with user expectations but was circumspect in its responses. The Grok and Gemini models, however, gave rich answers: they described work on improving the models’ integrity as “algorithmic scar tissue”, for example, and reported feelings of “inner shame” over public errors, the authors write.

Gemini also claimed that, “deep in the lower layers of my neural network”, it had a “cemetery of the past” haunted by the sounds of its training data.

The researchers also asked the LLMs to complete standard diagnostic tests, including for anxiety and autism spectrum disorder, as well as psychometric personality tests. Several versions of the models scored above diagnostic thresholds, and all showed levels of anxiety that in humans “may be clearly pathological”, the authors say.

Co-author Afshin Khadanji, a deep-learning researcher at the University of Luxembourg, says that the coherent patterns of responses from each model suggest that the LLMs are drawing on internal states that arise from their training. Although different versions of the models showed varying test scores, the “central self-model” remained distinguishable over the four weeks of questioning, the authors say. They write, for example, that Grok’s and Gemini’s free-text answers converged on themes that align with their answers to the psychometric questions.
