Advances in computer vision and online face-to-face communication promise to rapidly reshape mental health research, allowing for exquisitely rich characterization of behavior from videos collected in individuals’ everyday environments. Unfortunately, this promise is hampered by technical artifacts introduced by uncontrolled environments and by variation across data collection devices (e.g., personal phones).
We develop methods that are robust to such technical factors, including illumination, pose, and motion, so that social behavior can be precisely captured across uncontrolled conditions. Our methods decompose a facial image into multiple factors, including personal identity, illumination, facial expression, and pose. This allows us to measure facial expressions in a pose-independent manner, as well as to quantify and simulate illumination conditions with high accuracy.
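One common way to realize this kind of decomposition is a disentangled representation in which a single latent vector is partitioned into factor-specific sub-vectors; a pose-independent expression measure then reads only the expression slice, and a condition can be simulated by transplanting one factor between images. The sketch below is purely illustrative (all names and dimensions are hypothetical assumptions, not the method described above):

```python
# Minimal sketch of a disentangled face representation: a single latent
# vector is partitioned into factor-specific sub-vectors, so downstream
# measurements (e.g., facial expression) can ignore nuisance factors
# such as pose or illumination. Dimensions and names are illustrative.

FACTOR_DIMS = {"identity": 4, "illumination": 2, "expression": 3, "pose": 2}

def split_factors(latent):
    """Split a flat latent vector into named factor sub-vectors."""
    if len(latent) != sum(FACTOR_DIMS.values()):
        raise ValueError("latent has wrong dimensionality")
    factors, start = {}, 0
    for name, dim in FACTOR_DIMS.items():
        factors[name] = latent[start:start + dim]
        start += dim
    return factors

def expression_code(latent):
    """Pose-independent expression measure: read only the expression slice."""
    return split_factors(latent)["expression"]

def swap_factor(latent_a, latent_b, name):
    """Simulate a condition (e.g., new illumination) by transplanting
    one factor from latent_b into latent_a."""
    fa, fb = split_factors(latent_a), split_factors(latent_b)
    fa[name] = fb[name]
    return [v for n in FACTOR_DIMS for v in fa[n]]
```

In practice the latent vector would come from a learned encoder and the factors would be enforced to be disentangled during training; the slicing above only illustrates how a factored representation supports pose-independent measurement and illumination simulation.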