SeEar Project Image

Hatthini

Conversational Companion for Emotional Regulation

Overview


We designed a large scale dataset on Emotion and Emotion-cause to enhance the performance of exisitng Emotionally Aware Dialogue Systems.

Role


  • Assisted in designing a comprehensive pipeline for data gathering, cleaning, labeling, and validation, resulting in a rich dataset of 700,000 emotion-cause tweets from an initial 1M dataset.
  • Implemented key preprocessing techniques:
    • Text normalization, tokenization, topic modeling, and semantic analysis to enhance dataset quality.
  • Incorporated human validation via Amazon Mechanical Turk for accuracy assessment.
  • Labeling the dataset with state-of-the-art LLMs like GPT-3.5, BART, and T5.
  • Fine-tuned and evaluated the state-of-the-art LLMs like GPT-3.5, BART, and T5 with the enhanced dataset for optimal performance.
  • Developed a mobile interface to interact with the emotional regulation companion.

Findings


  • 90% of emotion-cause labels in the dataset have an average rating of 4.0 or above across relevance, fluency, and consistency as reported by annotators.
  • Achieved optimal performance across tasks, demonstrating the dataset's value in advancing emotion-cause analysis.

Publications