Facial expression analysis is one of the most studied problems in computer vision, motivated by numerous applications in industry, clinical research, entertainment, and marketing. Variations in head pose create a major challenge for facial expression analysis, as the same expression looks markedly different from different angles. Disentangling facial expression from pose is important both for improving expression recognition accuracy and for explainability: one cannot reliably interpret the decisions of a facial behavior analysis system if the expression coefficients are confounded by head movements.
We study the limitations of current approaches and modeling techniques (e.g., the use of a weak perspective camera model) and assess the “true” performance of state-of-the-art methods in behavioral research. To this end, we derive theoretical limitations of those methods and experimentally quantify their error bounds.