AI Empathy in Mediation: When Algorithms Show Compassion

By Michael Lardy
January 6, 2026

https://mediate.com/ai-empathy-in-mediation-when-algorithms-show-compassion/

Empathy, a core competency in mediation, is not merely a personality trait, but fulfills a methodological function: it builds trust, enables perspective-taking, and forms the basis for constructive communication between conflicting parties. Without a minimum level of empathic resonance, it is difficult to create a safe space in which interests and emotions can be openly discussed(1).

With the advent of powerful AI systems, especially large language models (LLMs) such as ChatGPT, Gemini, or Claude, the question increasingly arises as to whether and to what extent these systems can develop or at least simulate a comparable capacity for empathy(2). This question touches on the self-image of a profession that has so far focused on humans as beings with the unique ability to feel compassion and empathy(3).

In recent years, research on this topic has developed very dynamically: numerous studies have shown that LLMs can identify and name emotional states in test situations and respond appropriately to them – in some cases even better than human comparison groups(4). Other studies, however, warn against the “illusion of empathy,” i.e., linguistic warmth without genuine understanding of content(5). This distinction is essential for mediation practice, as every statement is made in the context of complex relationship dynamics.

Against this background, my aim in this article is to present the current state of research on the empathy of AI systems, describe their possibilities and limitations in the context of mediation-related processes, and discuss the ethical and social implications.

What is empathy?

Empathy is not a uniformly defined term in psychological and communication science literature. Most authors distinguish between cognitive empathy—the intellectual understanding of another person’s feelings and perspectives—and affective empathy—the emotional experience of these feelings(6). In some cases, a third dimension is distinguished, namely compassion, which encompasses not only empathy but also the motivation to act in a supportive manner(7)(8)(9).

All three dimensions play a role in mediation, with cognitive empathy being particularly important in the structured reconstruction of perspectives, while affective empathy strengthens emotional connection and compassion acts as a de-escalating impulse for action(10). However, transferring these concepts to AI systems encounters methodological and conceptual limitations: machines have neither subjective experience nor emotions in the human sense, but can only recognize patterns in language or other data and reproduce them(11).

To measure empathy in the context of artificial intelligence, methods based on observable functions and performance are therefore usually used. One of the most important models is the EPITOME framework (Emotional Reactions, Interpretations, Explorations), which was originally developed for the analysis of peer support conversations(12). It codes empathic communication into three mechanisms: the immediate emotional reaction to a statement, the interpretation of the underlying meaning, and exploration through follow-up questions. Studies show that LLMs are particularly strong in the first category—the linguistic mirroring of emotions—while they perform weaker on average in interpretation and exploration(13).
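For illustration, the sketch below shows how a single response could be scored along the three EPITOME mechanisms using an LLM as rater. It is a minimal Python example assuming the OpenAI Python SDK; the prompt wording, the 0-to-2 scale, and the model name are my own illustration, not the purpose-trained classifiers used by Sharma et al. (2020).

```python
# Minimal sketch of an LLM-as-rater for the three EPITOME mechanisms.
# Prompt, scale, and model name are illustrative only; the original
# framework (Sharma et al., 2020) relies on purpose-trained classifiers.
import json
from openai import OpenAI  # assumes the official OpenAI Python SDK

client = OpenAI()  # expects OPENAI_API_KEY in the environment

RATING_PROMPT = """Rate the RESPONSE to the SEEKER's statement on each EPITOME
mechanism from 0 (absent) to 2 (strong):
- emotional_reaction: expressing warmth, compassion, or concern
- interpretation: communicating an understanding of the seeker's situation
- exploration: probing the seeker's feelings with follow-up questions
Answer with JSON only, e.g. {{"emotional_reaction": 1, "interpretation": 0, "exploration": 2}}.

SEEKER: {seeker}
RESPONSE: {response}"""


def epitome_scores(seeker: str, response: str, model: str = "gpt-4o") -> dict:
    """Return a dict with one 0-2 score per EPITOME mechanism."""
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": RATING_PROMPT.format(seeker=seeker, response=response)}],
    )
    return json.loads(completion.choices[0].message.content)


if __name__ == "__main__":
    print(epitome_scores(
        "My business partner makes decisions without ever consulting me.",
        "That sounds really frustrating. What happens when you raise it with him?",
    ))
```

A setup of this kind is also what the studies discussed below implicitly rely on: empathy is rated from the observable text, not from any inner state of the model.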

Another established measurement approach is the Levels of Emotional Awareness Scale (LEAS), a psychological test that confronts test subjects with hypothetical scenarios and assesses their ability to name their own and others’ emotions in a differentiated manner(14). In a study by Elyoseph et al. (2023), ChatGPT achieved significantly higher scores than the human comparison group and even improved its results in a second test conducted one month later(15). In addition, PsychoBench, introduced in 2024, offers a test framework with 13 psychometric scales, including empathy and emotional intelligence tests, which can be used for direct comparison of different AI models(16).

In addition to these scales, more complex evaluation frameworks have recently been used that analyze dialogue sequences in real time and compare them with codes from motivational interviewing (MI) research. These methods enable a more precise assessment of the extent to which AI systems not only provide empathetic-sounding formulations but also perform empathetic acts during the conversation. Nevertheless, the central methodological challenge remains: empathy in AI is always a display, a recognizable pattern that evokes the impression of empathy in the other person without any underlying emotional state.

State of research

The scientific debate on the empathy of LLMs has gained considerable momentum since 2023.

One of the earliest studies to test AI in the field of emotional awareness, as mentioned above, was conducted by Elyoseph et al. (2023)(17). Using the Levels of Emotional Awareness Scale (LEAS), the authors compared ChatGPT’s performance with norm values from the general population. ChatGPT achieved significantly higher scores and even improved between two test dates within a month.

For mediation, this suggests that LLMs are able to differentiate between emotional states and name them precisely in language – a skill that can be particularly valuable in the problem exploration phase.

  • A Harvard University/University of Graz study by Li, Herderich, and Goldenberg (2024)(18) examined the ability of GPT-4 and human subjects to cognitively reevaluate (“reframe” in communication science) negative situations. GPT-4 outperformed human controls in three out of four evaluation domains, even when humans were offered financial incentives for better performance. This suggests that AI can be used for reframing tasks.

  • Cuadra et al. (2024)(19) coined the term “illusion of empathy” to describe how LLMs often generate linguistic warmth but perform less well in the dimensions of interpretation and exploration. A systematic comparison showed that AI statements appeared empathetic in the initial contact, but were less substantive in more in-depth conversation phases.

  • Schlegel et al. (2025)(20) tested LLMs and humans using standardized emotional intelligence tests. Current large language models (including GPT-4) performed significantly better than average human test subjects at recognizing, understanding, and appropriately regulating emotions. GPT-4 was also able to create realistic and versatile new test items that largely matched human-developed tests in terms of difficulty, clarity, and real-world relevance.

These results suggest that modern AI models possess a high degree of “cognitive empathy,” i.e., they demonstrate precise knowledge of emotions and their regulation, a key prerequisite for acting convincingly in empathy-related contexts such as mediation, counseling, or customer service.

  • Huang et al. (2024)(21) presented PsychoBench, a framework comprising 13 psychometric tests, including empathy tests. It enables direct model comparisons and can be used for both research and practical testing. For mediation, this opens up the possibility of systematically testing AI assistants for empathy and communication skills before they are deployed.

  • Juquelier et al. (2025)(22) investigated in three experiments how empathic chatbots influence perceived social presence and the quality of information. Under normal conditions, empathic formulations increased user satisfaction. Under time pressure, however, the effect was reversed – participants found the empathy distracting.

  • A study by Chen et al. (2026)(23) on intercultural empathy found that deliberative dialogue with an AI partner increased empathy scores among US participants, but not among Latin American participants. This suggests that empathic communication is culturally influenced and requires special attention in internationally diverse mediation groups.

  • Mei et al. (2024)(24) tested GPT-4 in classic economic behavior games (ultimatum, trust, prisoner’s dilemma, and public goods games) and found that the model often behaved more cooperatively and altruistically than the average human comparison group.

Benefits for mediation work?

The latest research findings indicate that LLMs can already provide useful support in several areas of mediation-related work. The aim is not to replace human mediators, but to expand their tools and skills. Key areas of application are presented below.

Support in preparing for discussions

AI can be used prior to the actual mediation (pre-mediation) to structure conflict histories and identify core emotional issues. Here, LLMs use their ability to differentiate between emotions, as demonstrated in the LEAS study by Elyoseph et al. (2023). By analyzing written preliminary discussions or emails, potential points of escalation can be identified, which facilitates the mediator’s preparation.
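As a rough sketch of what such pre-mediation screening could look like in practice, the following Python example condenses a party's written statement into core emotions, underlying interests, and potential escalation points. The OpenAI SDK is assumed; the prompt, the output fields, and the function name are hypothetical and not a tool described in the cited studies.

```python
# Hypothetical pre-mediation screening aid: condense a party's written
# statement into emotions, interests, and possible escalation points.
# The output is a draft for the human mediator to verify, never a finding.
import json
from openai import OpenAI

client = OpenAI()

SCREENING_PROMPT = """You support a mediator preparing a session.
From the party statement below, extract:
- "core_emotions": the main emotions expressed, each with a short quote
- "underlying_interests": needs or interests that seem to drive the conflict
- "escalation_points": topics or phrasings likely to escalate the conversation
Answer with JSON only, using exactly those three keys.

STATEMENT:
{statement}"""


def screen_statement(statement: str, model: str = "gpt-4o") -> dict:
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": SCREENING_PROMPT.format(statement=statement)}],
    )
    return json.loads(completion.choices[0].message.content)


if __name__ == "__main__":
    email = ("Once again my ex-partner changed the handover time without asking me. "
             "I am tired of being treated as if my schedule does not matter.")
    print(json.dumps(screen_statement(email), indent=2))
```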

Reframing

The above-average performance of GPT-4 in cognitive reappraisal (reframing) documented by Li, Herderich, and Goldenberg (2024) suggests that AI can be used specifically as a reframing assistant. In practice, this means that a mediator can enter the parties’ statements into the system anonymously during preparation or, with the consent of both parties, during a session in order to obtain alternative, less confrontational expressions. This can help to break down barriers to communication or defuse ambiguous statements.
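A reframing assistant of this kind could be as simple as the following sketch (again assuming the OpenAI SDK; prompt and function name are illustrative). The output is a set of candidate rewordings that the mediator reviews and edits before any of them reaches the parties.

```python
# Illustrative reframing helper: propose less confrontational rewordings
# of a party's statement. The mediator selects, edits, or discards them.
from openai import OpenAI

client = OpenAI()

REFRAME_PROMPT = """A party in a mediation said:

"{statement}"

Suggest {n} alternative formulations that preserve the speaker's interest
and emotion but remove blame and absolute terms ("always", "never").
Return one suggestion per line, without numbering or commentary."""


def reframe(statement: str, n: int = 3, model: str = "gpt-4o") -> list[str]:
    completion = client.chat.completions.create(
        model=model,
        messages=[{"role": "user",
                   "content": REFRAME_PROMPT.format(statement=statement, n=n)}],
    )
    lines = completion.choices[0].message.content.splitlines()
    return [line.strip() for line in lines if line.strip()]


if __name__ == "__main__":
    for suggestion in reframe("You never deliver on time and you obviously don't care."):
        print("-", suggestion)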

Proposal generation

In the options phase of mediation, LLMs can be used to generate solution-oriented proposals. The results of the behavioral game study by Mei et al. (2024) show that GPT-4 tends to act more altruistically in cooperative scenarios than the average value of human comparison groups. This can be used in mediation-like settings to show the parties options that build on common interests.
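Sketched in the same style (hypothetical prompt and function, OpenAI SDK assumed), option generation could take both parties' stated interests and ask the model for proposals that build on the overlap; the parties and the mediator remain free to reject every one of them.

```python
# Illustrative option generator for the options phase: propose solutions
# that build on interests both parties have named. Results are raw material
# for the parties, not recommendations.
from openai import OpenAI

client = OpenAI()

OPTIONS_PROMPT = """In a mediation, party A named these interests:
{a_interests}

Party B named these interests:
{b_interests}

List {n} concrete settlement options that serve interests appearing on both
sides. One option per line, neutral wording, no evaluation of the parties."""


def generate_options(a_interests: list[str], b_interests: list[str],
                     n: int = 5, model: str = "gpt-4o") -> list[str]:
    content = OPTIONS_PROMPT.format(
        a_interests="\n".join(f"- {i}" for i in a_interests),
        b_interests="\n".join(f"- {i}" for i in b_interests),
        n=n,
    )
    completion = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": content}])
    return [line.strip("- ").strip()
            for line in completion.choices[0].message.content.splitlines()
            if line.strip()]


if __name__ == "__main__":
    print(generate_options(
        ["keep the joint client base", "predictable workload"],
        ["planning certainty", "keep the joint client base"],
    ))
```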

Co-moderator in ODR

In ODR formats where the mediator/moderator and parties are not physically present, AI can act as a structuring co-moderator. It can summarize conversation logs in real time, highlight key statements, and remind participants of open issues. Pilot projects such as the study “Robots in the Middle”(25) show that GPT-4 can provide impetus for de-escalating interventions in simulated online mediation scenarios.
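A minimal co-moderation loop for such an ODR setting might look like the sketch below (OpenAI SDK assumed; the class, prompt, and update cadence are hypothetical and not the setup of the “Robots in the Middle” study): the assistant keeps the running transcript and, on request, returns a short neutral summary plus the issues that remain open.

```python
# Hypothetical ODR co-moderator: collects transcript turns and produces a
# rolling summary with open issues whenever the human moderator asks for one.
from openai import OpenAI

client = OpenAI()

DIGEST_PROMPT = """You assist the moderator of an online mediation.
Based on the transcript below, provide:
1. A neutral three-sentence summary of the discussion so far.
2. A bullet list of issues that were raised but not yet resolved.
Do not take sides and do not propose outcomes.

TRANSCRIPT:
{transcript}"""


class CoModerator:
    def __init__(self, model: str = "gpt-4o"):
        self.model = model
        self.turns: list[str] = []

    def add_turn(self, speaker: str, text: str) -> None:
        self.turns.append(f"{speaker}: {text}")

    def digest(self) -> str:
        completion = client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user",
                       "content": DIGEST_PROMPT.format(transcript="\n".join(self.turns))}],
        )
        return completion.choices[0].message.content


if __name__ == "__main__":
    cm = CoModerator()
    cm.add_turn("Party A", "The delivery was three weeks late and cost us a client.")
    cm.add_turn("Party B", "The delay was caused by specification changes on your side.")
    print(cm.digest())
```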

The high scores achieved by LLMs in emotional intelligence tests (Schlegel et al., 2025) and psychometric benchmarks (Huang et al., 2024) also open up new possibilities in mediator training. AI can serve as a feedback system that shows prospective mediators which of their statements come across as empathetic and where there is potential for more precise or differentiated wording.

Limitations and risks

As promising as the results of recent research on the empathy capabilities of AI systems may seem, it is essential for mediation practice to clearly identify the existing weaknesses, dangers, and ethical issues of this technology. The following points show that the use of empathic AI in mediation carries considerable risks without careful design and human supervision.

The distinction introduced by Cuadra et al. (2024) between emotional reactions on the one hand and the more profound mechanisms of interpretation and exploration on the other illustrates that LLMs often show deficits in the latter. In mediation, this can lead to parties feeling understood without any new perspectives or solutions actually being developed – a deceptive sense of progress.

Several studies show that AI systems can react differently in empathetic communication depending on the perceived identity of the other person(26). In experiments with different demographic profiles, LLMs sometimes provided stereotypical or distorted responses. This is particularly problematic for mediation, where neutrality and impartiality are central principles. Undetected biases could not only undermine the trust of the parties, but also raise legal liability issues.

The study by Chen et al. (2026) on intercultural empathy shows that the effect of empathic communication depends heavily on cultural expectations. In international mediation settings, there is therefore a risk that an AI-generated “empathic” message may not only be ineffective in certain cultural circles, but may even be inappropriate.

The behavioral game study by Mei et al. (2024) shows that GPT-4 tends to favor cooperative and altruistic decisions. While this is desirable in many conflict situations, it can be problematic in others—for example, when an overly cooperative pattern leads to the premature watering down of legitimate but contentious positions held by one party. This phenomenon of “overly accommodating” AI can lead to one-sided dynamics.

The experiments conducted by Juquelier, Poncin, and Hazée (2025) illustrate that the benefits of empathetic communication depend on the situational context. Under time pressure or when the focus is highly task-oriented, a consistently “warm” tone can be perceived as disruptive or artificial. AI systems that do not adaptively take the context of the conversation into account therefore risk producing counterproductive effects. The use of empathetic AI in highly sensitive processes such as mediation is subject to strict legal requirements.

Ethical and social implications

The integration of empathetic AI systems into mediation work raises profound ethical and social issues. These relate in particular to trust in the process, confidentiality, protection against discrimination, and the professional and liability responsibilities of the mediators involved.

Trust is not only a necessary framework condition in mediation, but also a process goal in itself. The integration of AI can both strengthen and undermine this trust. Studies show that participants perceive AI-generated communication as high quality, but rate it as less empathetic once it becomes clear that it does not originate from a human being. In this area of tension, mediators must balance the duty of transparency with the risk of “self-devaluation” of the empathetic effect. The principle of informed consent suggests that parties must be clearly informed in advance about the use of AI.

In mediation, confidentiality is a central principle that is often contractually or legally protected. The use of cloud-based LLMs raises questions here about data transfer to third countries and purpose limitation. Even if the content of conversations is not stored directly, the processing of sensitive data in training or fine-tuning processes may violate applicable data protection law.

Like all high-risk AI applications, empathetic AI must comply with requirements to prevent discriminatory effects. This includes both avoiding bias in the training data and implementing mechanisms that detect and block unfair or stereotypical statements. This is particularly relevant in mediation-related procedures, as unbalanced interventions can lead to de facto partisanship, thereby compromising the requirement of neutrality.

The use of AI does not release the mediator from their responsibility. If one party claims that AI intervention contributed to an unfavorable or unbalanced outcome, this could have consequences under liability law.

In the long term, the question arises as to whether increased use of empathetic AI could lead to an erosion of human empathy. While some researchers argue that AI-supported training can even improve mediators’ empathic abilities, others warn against a “delegation of competence” in which the constant outsourcing of certain conversational functions leads to an impoverishment of social interaction skills. This debate touches on the core of mediation’s social role as a human-centered process.

Outlook

Previous research findings suggest that AI systems capable of empathy will play an increasingly important role in mediation-related processes in the coming years. However, the development will not move toward completely replacing human mediators, but rather toward hybrid models in which humans and machines work together in a complementary manner.

Despite numerous studies, there are still significant gaps in research:

Long-term effects: Previous studies have mainly been cross-sectional. There is a lack of reliable data on how the use of empathetic AI affects the course and sustainability of conflict resolution in the long term.

Cultural diversity: As Chen et al. (2026) have shown, the effect of empathetic AI varies across cultures. Targeted intercultural research designs are needed to understand what adjustments are necessary for multinational mediation settings.

Multimodality: Most tests are based on text interaction. Studies on multimodal systems that incorporate language, gestures, and facial expressions are rare, even though nonverbal signals are central to empathy.

Bias detection and correction: Initial methods for bias detection exist, but there are no standardized benchmarks specifically tailored to mediation-relevant conversation contexts.

Three possible scenarios are emerging:

Assistance mode: AI systems serve as analysis and formulation aids without directly intervening in the dialogue. Mediators use them to prepare for conversations, for documentation, and as a reframing tool.

Co-mediator mode: AI acts as an additional conversation partner, taking on structuring tasks and providing empathetic interventions in real time, but remaining clearly recognizable as AI.

Autonomous mode: Complete execution of simple, standardized mediation procedures by AI, for example in high-volume ODR platforms. This scenario raises significant ethical and legal questions.

In the medium term, a hybrid model is most likely, in which AI takes over certain sub-functions while humans retain responsibility for the process. In this setting, LLMs could, for example, identify key emotional issues, offer suggestions for alternative formulations, or control culturally and contextually sensitive levels of empathy. The human mediator would evaluate and adapt these inputs and embed them in the overall framework of the process. 

Two developments are crucial for the sustainable and responsible use of empathetic AI in mediation:

Professionalization: Mediators must acquire skills in using AI tools, including the ability to leverage their strengths and identify risks.

Regulation: National and international professional associations should develop practice-oriented guidelines that set both technical and ethical standards. The EU AI Act provides a framework for this, but it still needs to be spelled out for the specific requirements of mediation.

Empathy in mediation is more than a communication technique—it is an attitude based on genuine understanding, impartiality, and the protection of a safe space for dialogue. LLMs can now simulate individual facets of this skill with astonishing conviction, providing mediators with valuable input – whether in analyzing conflict dynamics, reframing, or developing solution-oriented proposals.

However, empathy in AI always remains a projection: a linguistically generated pattern that conveys the impression of empathy without actual experience. This difference is not only a theoretical but also a practical anchor point for responsible use. Hybrid models in which humans and machines combine their respective strengths offer the greatest opportunities – provided they are supported by clear ethical guidelines, transparent processes, and trained specialists. The true value of empathetic AI will not be measured by whether it replaces humans, but by whether it enables them to use empathy more effectively, reflectively, and inclusively. In this sense, AI cannot be the “center” of mediation, but rather an amplifier for what remains at its core: a deeply human space for dialogue.

www.MichaelLardy.com

mail@MichaelLardy.com

LinkedIn: https://www.linkedin.com/in/michael-lardy-019394213/

Bibliography

  1. Meinhart, S. (2015): Empathy in Mediation. AV-Akademikerverlag.

  2. Schlegel, K. et al. (2025): Large language models are proficient in solving and creating emotional intelligence tests. Communications Psychology, 3:80. DOI: 10.1038/s44271-025-00258-x.

  3. Menkel-Meadow, C. (2018): Mediation – Theory, Policy and Practice.

  4. Elyoseph, Z. et al. (2023): ChatGPT outperforms humans in emotional awareness evaluations. Frontiers in Psychology.

  5. Cuadra, C. et al. (2024): The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction.

  6. Altmann: Empathy. https://www.socialnet.de/lexikon/Empathie (retrieved on 14.8.2025).

  7. Boeger, A., & Lüdmann, M. (2022): Empathy. In Psychology for Health Sciences. Springer.

  8. Davis, M. H. (1994): Empathy – A social psychological approach.

  9. Singer, T., & Klimecki, O. M. (2014): Empathy and compassion.

  10. Moore, C. W. (2014): The Mediation Process.

  11. Menkel-Meadow, C. (2018): Mediation – Theory, Policy and Practice.

  12. Sharma, A. et al. (2020): A computational approach to understanding empathy expressed in text-based mental health support.

  13. Cuadra, C. et al. (2024): The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction.

  14. Lane, R. D. et al. (1990): The Levels of Emotional Awareness Scale: A cognitive-developmental measure of emotion.

  15. Elyoseph, Z. et al. (2023): ChatGPT outperforms humans in emotional awareness evaluations. Frontiers in Psychology.

  16. Huang, J. et al. (2024): Who is ChatGPT? Benchmarking LLMs' psychological portrayal using PsychoBench.

  17. Elyoseph, Z. et al. (2023): ChatGPT outperforms humans in emotional awareness evaluations. Frontiers in Psychology.

  18. Li, J., Herderich, K., & Goldenberg, A. (2024): Cognitive Reappraisal with AI Assistance. Harvard University & University of Graz Working Paper.

  19. Cuadra, C. et al. (2024): The Illusion of Empathy? Notes on Displays of Emotion in Human-Computer Interaction.

  20. Schlegel, K. et al. (2025): Large language models are proficient in solving and creating emotional intelligence tests. Communications Psychology, 3:80. DOI: 10.1038/s44271-025-00258-x.

  21. Huang, J. et al. (2024): Who is ChatGPT? Benchmarking LLMs' psychological portrayal using PsychoBench.

  22. Juquelier, A. et al. (2025): Empathic chatbots: a double-edged sword in customer experiences.

  23. Chen et al. (2026): AI as a deliberative partner fosters intercultural empathy for Americans but fails for Latin American participants.

  24. Mei et al. (2024): A Turing test of whether AI chatbots are behaviorally similar to humans.

  25. Westermann et al. (2024): Robots in the middle: evaluating LLMs in dispute resolution.

  26. Buolamwini, J., & Gebru, T. (2018): Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification.


Michael Lardy: As a mediator in Salzburg, I offer professional mediation, dispute resolution and conflict resolution,
specializing in family mediation, divorce mediation and business mediation.