Computer Science 5th Year Master's Thesis Presentation

— 10:00am

Location:
In Person - Traffic21 Classroom, Gates Hillman 6501

Speaker:
CHENG CHARLES MA, Master's Student, Computer Science Department, Carnegie Mellon University
https://www.linkedin.com/in/cheng-charles-ma


Large Language Model Aided Modeling of Dyadic Engagement

Augmented reality (AR) glasses have become increasingly discreet and capable given the advancements in their design, sensor technology, and processing power over the past decade. Equipped with egocentric video cameras, gaze trackers, and other sensors, AR glasses offer a unique opportunity to study human behavior in an unobtrusive and naturalistic manner. This thesis specifically focuses on predicting engagement in dyadic social interactions, as engagement is a key component of human communication. The ability to understand and model engagement can augment and improve human communication, as well as inform the development of increasingly user-centric and socially intelligent technologies.

As part of this work, we assembled a novel dataset featuring 17 pairs of participants engaged in casual conversations recorded on AR glasses. Engagement was measured through each participant’s self-reported rating, complemented by external raters’ scores. Inspired by successful applications of Large Language Models (LLMs) in other domains, we also introduce a novel fusion strategy that converts behavioral data into text and combines participant information to predict user engagement, using multiple pretrained models and the inference abilities of LLMs. To the best of our knowledge, this is the first approach to “reason” about human behavior using language representations of non-verbal cues and LLMs. We show this can be a powerful, simple, and flexible framework for future work on modeling human behavior and the development of socially intelligent technologies.

Thesis Committee

Fernando De La Torre (Chair)
Daphne Ippolito
Lori Holt (University of Texas at Austin)
