02 Oct 2020
14:00 Master's Defense (fully remote)
Theme
Cross-dataset emotion recognition from facial expressions through convolutional neural networks
Student
William Marques Dias
Advisor
Anderson de Rezende Rocha
Brief summary
The face is the window of the soul. So thought Duchenne de Boulogne, a 19th-century French physician. Using electric shocks to stimulate muscle contractions and induce frightening, bizarre-looking expressions, he sought to understand how muscles produce facial expressions and thereby reveal the most hidden human emotions. Two centuries later, this field of research remains very active and draws interest from many segments of industry: automatic emotion and facial-expression recognition systems are applied in medicine, in security and surveillance, and in advertising and marketing, among other areas. Yet despite this widespread adoption, fundamental questions remain open when we analyze a person's emotional state from their facial expressions. Is it possible to reliably infer someone's internal state based only on the movements of their facial muscles? Is there a universal facial configuration for expressing anger, disgust, fear, happiness, sadness, and surprise, commonly called the basic emotions?

In this research, we address these questions with convolutional neural networks. Unlike most studies available in the literature, we are particularly interested in examining whether features learned from one group of people can be used to successfully predict the emotions of another. Accordingly, we adopt a cross-dataset evaluation protocol to measure the performance of the proposed methods. Our baseline method fine-tunes a model originally trained for face recognition to the task of emotion categorization. We then apply data-visualization techniques to understand what this base network has learned, and from that analysis we derive three further methods. The first directs the network's attention to regions of the face considered important in the literature but ignored by our initial model, using a multi-branch architecture for a part-based approach. In the second, we simplify this architecture and work on the input data, hiding random parts of the facial image so that the network learns discriminative features across different regions (sketched below). In the third, we explore a loss function that maps the data into a high-dimensional space where examples of the same emotion class lie close together and examples of different classes lie far apart (also sketched below). Finally, we investigate the complementarity between two of our methods, proposing a late-fusion technique that combines their results by multiplying probabilities (see the last sketch below).

For comparison, we compiled an extensive list of prior works evaluated on the same datasets. On all of them, even against studies that followed a single-dataset evaluation protocol, our methods achieve competitive numbers.
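The second method hides random parts of the facial image during training. A minimal sketch of that idea, assuming a zeroed rectangle as the occluder (the thesis may use a different occluder shape, size range, or fill value):

```python
import numpy as np

def random_occlusion(image, max_frac=0.3, rng=None):
    """Zero out a random rectangle whose sides cover up to max_frac
    of the image's height and width, forcing the network to rely on
    features from the regions that remain visible."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    occ_h = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    occ_w = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - occ_h + 1))
    left = int(rng.integers(0, w - occ_w + 1))
    out = image.copy()
    out[top:top + occ_h, left:left + occ_w] = 0  # hide the patch
    return out
```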
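The third method's loss is described only by its effect (same-class embeddings close, different-class embeddings distant); the abstract does not name it. A triplet margin loss is one common way to obtain that behavior. The sketch below uses PyTorch; the margin value and the `encoder` network are illustrative assumptions:

```python
import torch.nn as nn

# Margin value is an assumption, not taken from the thesis.
triplet_loss = nn.TripletMarginLoss(margin=0.2)

def metric_step(encoder, anchor, positive, negative):
    """anchor and positive share an emotion label; negative does not."""
    z_a = encoder(anchor)    # embeddings in a high-dimensional space
    z_p = encoder(positive)
    z_n = encoder(negative)
    # Pulls z_a toward z_p and pushes it away from z_n by the margin.
    return triplet_loss(z_a, z_p, z_n)
```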
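Last, the late-fusion rule, which combines two methods' outputs by multiplying probabilities. A minimal sketch; the renormalization step and the example numbers are illustrative assumptions:

```python
import numpy as np

def product_fusion(p1, p2):
    """p1, p2: class-probability vectors (e.g., softmax outputs)
    from two models over the same emotion classes."""
    fused = p1 * p2             # elementwise product of probabilities
    return fused / fused.sum()  # renormalize to sum to 1

# Hypothetical probabilities over six basic-emotion classes:
p_a = np.array([0.10, 0.05, 0.05, 0.60, 0.10, 0.10])
p_b = np.array([0.20, 0.05, 0.05, 0.40, 0.20, 0.10])
print(product_fusion(p_a, p_b).argmax())  # predicted class index
```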
Examination Board
Members:
Anderson de Rezende Rocha IC / UNICAMP
Teófilo Emidio de Campos CIC / UnB
Paula Dornhofer Paro Costa FEEC / UNICAMP
Substitutes:
Raphael Felipe de Carvalho Prates IC / UNICAMP
Marley Maria Bernardes Rebuzzi Vellasco PUC-Rio