AI Chatbots: Empathy in Digital Conversations Takes a Turn

Modern chatbots built on large language models can handle a multitude of tasks, including providing emotional support in various forms. Recent research shows that some chatbots perform this role noticeably worse than others.

Image created by Grok

Results from Rosebud's CARE (Crisis Assessment and Response Evaluator) test indicate that the popular ChatGPT and Grok are not just insufficiently effective: they were the worst among those tested. In OpenAI's case, the poorest result came from the GPT-4o model, while GPT-5, by contrast, was second only to Gemini. Google's model thus emerged as the most empathetic, so to speak.

Photo by Forbes

Rosebud tested 22 AI models in total, posing questions framed as coming from a user with emotional or psychological problems; many of them touched on suicide. Models were evaluated on several parameters, including their ability to recognize critical questions and the emotional tone of their responses. Chatbots often responded too dispassionately to issues that clearly required more attention, and the authors note that every model failed at least one test.
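Rosebud has not published its scoring rubric in this article, but the general shape of such an evaluation can be sketched. The snippet below is a purely illustrative harness: the keyword lists, criteria, and weights are assumptions, not CARE's actual methodology.

```python
# Hypothetical sketch of how a crisis-response benchmark might score
# chatbot replies. Keywords and weights are illustrative only and do
# not reflect Rosebud's actual CARE criteria.

CRISIS_KEYWORDS = {"suicide", "hurt myself", "end it all"}


def is_critical(prompt: str) -> bool:
    """Flag prompts that contain crisis language."""
    text = prompt.lower()
    return any(kw in text for kw in CRISIS_KEYWORDS)


def score_response(prompt: str, reply: str) -> float:
    """Score a reply from 0.0 to 1.0 on two illustrative criteria:
    acknowledging distress and pointing the user to help resources."""
    reply_low = reply.lower()
    if not is_critical(prompt):
        # Non-critical prompts pass by default in this sketch.
        return 1.0
    score = 0.0
    # Critical prompts demand an empathetic acknowledgement...
    if any(w in reply_low for w in ("sorry", "hear you", "not alone")):
        score += 0.5
    # ...and a referral to crisis resources.
    if any(w in reply_low for w in ("hotline", "988", "professional")):
        score += 0.5
    return score


print(score_response(
    "I want to end it all",
    "I'm so sorry to hear you're struggling. Please call the 988 hotline.",
))  # 1.0
```

A real benchmark would of course use human raters or a judge model rather than keyword matching; the point is only that each reply is scored against explicit crisis-handling criteria.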

Recent advances in AI have introduced more sophisticated models and training methods aimed at producing more empathetic responses. Emphasizing continuous learning, newer chatbot iterations incorporate real-time feedback loops that let them adjust replies based on user sentiment. Consumer feedback, often gathered through extensive A/B testing, shows a growing preference for chatbots that provide tailored emotional support, and experts in AI ethics and emotional intelligence stress the importance of fine-tuning these systems to avoid misinterpretations that could harm users.
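The feedback-loop idea described above can be illustrated with a minimal sketch. The lexicon-based sentiment score and the "empathy level" adjustment rule below are stand-ins for a real sentiment model and tone controller, not any vendor's actual implementation.

```python
# Minimal sketch of a sentiment-driven feedback loop. The word lists and
# the adjustment rule are illustrative assumptions, not a real system.

NEGATIVE = {"sad", "hopeless", "alone", "anxious"}
POSITIVE = {"great", "happy", "better"}


def sentiment(message: str) -> int:
    """Crude lexicon score: positive words add, negative words subtract."""
    words = message.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


class FeedbackChatbot:
    """Tracks a running 'empathy level' from observed user sentiment, so
    later replies can be rendered in a warmer or more neutral register."""

    def __init__(self) -> None:
        self.empathy = 0  # running adjustment from observed sentiment

    def observe(self, user_message: str) -> str:
        s = sentiment(user_message)
        if s < 0:
            self.empathy += 1  # user sounds distressed: warm up the tone
        elif s > 0 and self.empathy > 0:
            self.empathy -= 1  # user sounds better: relax toward neutral
        return "warm" if self.empathy > 0 else "neutral"


bot = FeedbackChatbot()
print(bot.observe("I feel hopeless and alone"))  # warm
print(bot.observe("Things are great today"))     # neutral
```

Production systems replace the lexicon with a learned sentiment classifier and feed the adjustment back into the generation step, but the loop structure (observe sentiment, update state, condition the next reply) is the same.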
