
GPT’s Devastated 😭 and LLaMA’s Content 😌: Emotion Representation Alignment in LLMs for Keyword-based Generation

Posted on: June 2, 2025 at 12:00 AM


Preprint

Abstract

In controlled text generation with large language models (LLMs), gaps arise between the model’s interpretation and human expectations. We study the problem of controlling emotions in keyword-based sentence generation for both GPT-4 and LLaMA-3. We selected four emotion representations: Words, Valence-Arousal-Dominance (VAD) dimensions expressed in both Lexical and Numeric forms, and Emojis. Our human evaluation examined Human–LLM alignment for each representation, as well as the accuracy and realism of the generated sentences. While representations like VAD break emotions into easy-to-compute components, our findings show that people agree more with LLM generations conditioned on English words (e.g., “angry”) than on VAD scales; this difference is especially visible when comparing Numeric VAD to words. However, converting the originally numeric VAD scales to Lexical scales (e.g., +4.0 becomes “High”) dramatically improved agreement. Furthermore, how strongly a generated sentence is perceived to convey an emotion depends heavily on the LLM, the representation type, and the specific emotion.
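For a concrete sense of the Numeric-to-Lexical VAD conversion mentioned above, here is a minimal Python sketch. The [-5, +5] scale range, the three-bin discretization, and the label names are assumptions for illustration only, not the mapping used in the paper.

# Illustrative sketch only: the paper converts numeric VAD scores into
# lexical labels (e.g., +4.0 becomes "High"). The scale range, bin
# boundaries, and label set below are assumptions, not the paper's.

def vad_to_lexical(score: float, lo: float = -5.0, hi: float = 5.0) -> str:
    """Map a numeric VAD score on [lo, hi] to a coarse lexical label."""
    labels = ["Low", "Medium", "High"]
    frac = (score - lo) / (hi - lo)  # normalize to [0, 1]
    idx = min(int(frac * len(labels)), len(labels) - 1)  # pick an equal-width bin
    return labels[idx]

# Example: a hypothetical "angry"-like Valence-Arousal-Dominance triple.
vad = {"Valence": -3.8, "Arousal": 4.0, "Dominance": 1.2}
print({dim: vad_to_lexical(v) for dim, v in vad.items()})
# -> {'Valence': 'Low', 'Arousal': 'High', 'Dominance': 'Medium'}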

Credits

Shadab Hafiz Choudhury 1,3, Asha Kumar 2, Dr. Lara Martin 1,3

1: Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County
2: Department of Information Systems, University of Maryland, Baltimore County
3: Corresponding Authors

Citation

@misc{choudhury2025gptsdevastatedllamascontent,
      title={GPT's Devastated and LLaMA's Content: Emotion Representation Alignment in LLMs for Keyword-based Generation},
      author={Shadab Choudhury and Asha Kumar and Lara J. Martin},
      year={2025},
      eprint={2503.11881},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2503.11881},
}