Doctoral Defense of Luiz Alberto Ferreira Gomes

23 Jun 2021

09:00 Doctoral defense Fully remote

Theme

Prediction-based software maintenance: a machine learning perspective

Student

Luiz Alberto Ferreira Gomes

Advisor / Teacher

Advisor: Mário Lucio Côrtes / Co-advisor: Ricardo da Silva Torres

Brief summary

Software maintenance in Free/Libre Open Source Software (FLOSS) is based mainly on information extracted from bug reports registered in Bug Tracking Systems (BTS). This type of system is considered essential for communication and collaboration in both Closed Source Software (CSS) and FLOSS environments. This is particularly true for the latter, an environment characterized by many users and developers with different specialties, spread around the world. Such users and developers interact with a BTS through bug reports, which allow them to communicate with those responsible for maintaining the software. Users fill in a bug report, providing a title, a description and a severity level. Once the report has been submitted, a member of the maintenance team reviews the information provided and approves or rejects the bug report. If the bug report is approved, the team member adds further information, such as its priority and the person assigned to fix the bug.

In this scenario, the number of bug reports in large and medium-sized FLOSS projects is often high. Handling this volume of bug reports manually can be tedious and error-prone, and a wrong decision can seriously affect the planning of maintenance activities for the project. Because of this difficulty and the evident importance of the information contained in bug reports for planning FLOSS maintenance, both industry and the academic community have shown considerable interest in this problem, and much research has been carried out in this area. These efforts have been based mainly on traditional Machine Learning (ML) and Text Mining (TM) techniques. Such techniques have been applied successfully to real problems in many areas, including those related to BTS, such as automatically assigning the person responsible for fixing a bug.

In a bug report, the severity level is one of the most critical variables for maintenance planning. It measures the impact of the bug on the execution of the software system and on the time required for the bug to be resolved. In addition, many studies point out that most bugs that adversely affect the user experience and maintenance planning in FLOSS releases are long-lived bugs. Even in small numbers, such long-lived bugs can annoy users for a long time and make it difficult for the management team to plan its activities.

In this context, our research focused on the two critical areas mentioned above, both related to bug reports, and provides the following contributions: a review of recent research efforts on automatic bug severity prediction that analyzed more than ten aspects of the experiments published in the literature; a survey and characterization of the long-lived bug population in six popular FLOSS projects; and a comparison of five well-known ML algorithms on the task of predicting long-lived bugs in those projects. In addition, in our latest contribution, we propose the use of Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art deep neural network for Natural Language Processing (NLP), as a feature extractor, in contrast to the conventional Term Frequency-Inverse Document Frequency (TF-IDF) method, to provide input features for pre-selected ML algorithms in long-lived bug prediction.

Our research efforts have produced a detailed and comprehensive view of the state of the art in severity level prediction approaches, indicating that traditional ML and TM applied to the unstructured textual attributes of bug reports played a central role in the approaches presented. In addition, we demonstrate that it is possible to predict long-lived bugs with good precision, even with simple ML algorithms and TM methods, such as a neural network and TF-IDF, applied to the unstructured textual attributes of a bug report.
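As an illustration of the conventional feature-extraction step mentioned in the summary, the following is a minimal sketch of TF-IDF weighting over bug-report descriptions. It assumes the plain variant tf × log(N / df) and whitespace tokenization; the thesis may well use a different variant, preprocessing, or library. The function names and the sample reports are hypothetical and are not taken from the thesis data.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real pipelines usually do more.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical bug-report descriptions, for illustration only.
reports = [
    "crash when opening settings dialog",
    "crash on startup after update",
    "typo in settings dialog label",
]
vectors = tfidf_vectors(reports)
```

In this sketch, a term that occurs in only one report (such as "opening") receives a higher weight than a term shared by several reports (such as "crash"). Vectors of this kind would then serve as input features to a machine learning classifier that predicts whether a bug is long-lived.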

Examination Board

Full members:

Mário Lucio Côrtes	IC / UNICAMP
Hélio Pedrini	IC / UNICAMP
Islene Calciolari Garcia	IC / UNICAMP
Rosana Teresinha Vaccare Braga	ICMC / USP
Maria Adriana Vidigal de Lima	FACOM / UFU

Substitutes:

Eliane Martins	IC / UNICAMP
Adler Diniz de Souza	IMC / UNIFEI
Alexandre Mello Ferreira	EEP
