30 Nov 2020
14:00 Doctoral defense Fully remote
Theme
Cross-domain emotion detection in tweets
Student
Fernando José Vieira da Silva
Advisor / Teacher
Ariadne Maria Brito Rizzoni Carvalho
Brief summary
Emotions are central to human life and have been studied extensively in several areas of science for many years. In Computer Science, interest in the subject has been growing, especially in the automatic detection of emotions. This thesis addresses the automatic detection of emotions in tweets (short texts published on social media) written in Portuguese. In particular, it studies the domain adaptation problem, in which a machine learning algorithm is trained on a data set extracted from a corpus in a source domain but tested on samples drawn from a corpus in another, "target" domain. The thesis argues that an algorithm that uses the syntactic tree structure of tweets to learn to identify the emotions expressed in them performs better at domain adaptation than a similar algorithm based only on word frequency. To validate this claim, two corpora were built and annotated with emotions: a corpus of automatically annotated tweets covering different subjects, and a corpus of manually annotated tweets related to the stock exchange. The first was used to train the algorithms and the second was used only for testing, in order to compare the algorithms on domain adaptation. The thesis concludes that the algorithm that uses the syntactic tree structure obtains statistically supported better results only for the "surprise" emotion. It also presents better results for another 4 emotions (out of a total of 8 evaluated), but without statistical test support. However, the methodology was not effective in discriminating neutral tweets, which do not express any emotion, and future studies are needed in this direction. In addition, the work showed that the proposed methodology obtained better results in detecting emotions in general, setting aside the problem of domain adaptation.
However, compared with related works, these results are still inferior to those reported for Deep Learning techniques, with the exception of the emotion "anticipation", for which they are the best among all the previous works reviewed. These comparisons should be viewed with reservations, since each work used a different corpus.
Examination Board
Full members:
Ariadne Maria Brito Rizzoni Carvalho IC / UNICAMP
Thiago Alexandre Salgueiro Pardo ICMC / USP
Ivandré Paraboni EACH / USP
Tomasz Kowaltowski IC / UNICAMP
Julio Cesar dos Reis IC / UNICAMP
Alternates:
Anderson de Rezende Rocha IC / UNICAMP
Jacques Wainer IC / UNICAMP
Clodoaldo Aparecido de Moraes Lima EACH / USP