Mind Wandering and Multimodal Affect Detection in the Wild

research; machine learning; affective computing; cognitive science

Face Forward: Detecting Mind Wandering from Video During Narrative Film Comprehension

Overview: Attention is key to effective learning, but mind wandering, a phenomenon in which attention shifts from task-related processing to task-unrelated thoughts, is pervasive across learning tasks. Intelligent learning environments would therefore benefit from mechanisms to detect and respond to such attentional lapses. As a step in this direction, we report the development and validation of the first student-independent, facial-feature-based mind wandering detector. We collected training data in a lab study in which participants self-reported when they caught themselves mind wandering over the course of a 32.5-minute narrative film comprehension task. We used computer vision techniques to extract facial features and bodily movements from the videos. Using supervised learning methods, we detected mind wandering with an F1 score of .390, a 31% improvement over a chance model. We discuss how our mind wandering detector can be used to adapt the learning experience, particularly in online learning contexts.
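The reported evaluation compares the detector's F1 score against a chance baseline. A minimal sketch of that comparison is below; the confusion-matrix counts and the chance-model F1 used in the usage note are hypothetical illustrations, not figures from the study:

```python
def f1_score(tp, fp, fn):
    """F1 from confusion-matrix counts: harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def relative_improvement(model_f1, chance_f1):
    """Fractional improvement of a model's F1 over a chance model's F1."""
    return (model_f1 - chance_f1) / chance_f1
```

For instance, 39 true positives with 61 false positives and 61 false negatives yields an F1 of .390; against a (hypothetical) chance F1 of about .298, `relative_improvement` returns roughly the 31% gain the abstract reports.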


  • Mind wandering
  • Attention aware interfaces
  • User modeling


Accuracy vs. Availability Heuristic in Multimodal Affect Detection in the Wild

Overview: This work examines multimodal affect detection from a fusion of facial expressions and interaction features derived from students’ interactions with an educational game in the noisy real-world context of a computer-enabled classroom. Log data of students’ interactions with the game and face videos from 133 students were recorded in a computer-enabled classroom over a two-day period. Human observers performed live annotations of learning-centered affective states such as engagement, confusion, and frustration. The face-only detectors were more accurate than the interaction-only detectors, and multimodal affect detectors did not substantially improve accuracy over the face-only detectors. However, the face-only detectors were applicable to only 65% of cases due to face registration errors caused by excessive movement, occlusion, poor lighting, and other factors. Multimodal fusion techniques improved the applicability of the detectors to 98% of cases without sacrificing classification accuracy. Balancing the accuracy vs. applicability tradeoff appears to be an important consideration in multimodal affect detection.
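One simple decision-level scheme that captures the accuracy vs. applicability tradeoff is to prefer the face-based prediction when face registration succeeded and fall back to the interaction-based prediction otherwise. This sketch is an assumption for illustration, not the paper's specific fusion method, and the labels in the usage note are made up:

```python
def fuse(face_preds, interaction_preds):
    """Decision-level fallback fusion.

    face_preds[i] is None when face registration failed for case i;
    in that event the interaction-based prediction is used instead.
    """
    return [f if f is not None else it
            for f, it in zip(face_preds, interaction_preds)]

def applicability(preds):
    """Fraction of cases for which any prediction is available."""
    return sum(p is not None for p in preds) / len(preds)
```

For example, with `face = ["engaged", None, "confused", None]` and `interaction = ["engaged", "frustrated", "bored", None]`, the face-only channel covers half the cases, while the fused output covers three quarters, mirroring how fusion raised applicability from 65% to 98% in the study.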


  • User/Machine systems
  • Human factors
  • Affect detection

Publication: ICMI’15