20 January 2026
14:00 - Doctoral defense, Room 85 of IC2
Title
From Consensus to Curriculum: Advancing Deep Semi-Supervised Learning via Meta-Pseudo-Labeling on 2D Projections
Student
David Aparco Cardenas
Advisor
Pedro Jussieu de Rezende / Co-advisor: Alexandre Xavier Falcão
Brief summary
Deep learning has been highly successful in image classification, but its reliance on large volumes of labeled data remains a critical barrier, especially in areas such as medical and biological image analysis, where data annotation is expensive, time-consuming, and requires expertise. Although current semi-supervised learning (SSL) methods have made progress on this problem, they rely on large-scale pre-trained encoders and extensive validation sets, which restricts network architecture, introduces unwanted biases, and requires additional labeled data for validation. To overcome these challenges, this thesis presents four contributions based on the Deep Feature Annotation (DeepFA) methodology, which exploits the underlying structure of deep feature embeddings through graph-based label propagation over two-dimensional projections. Each contribution builds on the previous one, addressing specific limitations of deep semi-supervised learning (DSSL) and proposing targeted strategies to simultaneously reduce annotation effort and improve generalization.
The first contribution introduces an approach that leverages user-drawn markers to initialize the weights of convolutional neural networks (CNNs) from scratch, using the Feature Learning from Image Markers (FLIM) methodology. By incorporating a consensus mechanism across multiple pseudo-labeling iterations, this approach improves label reliability without relying on pre-trained encoders or extensive validation sets.
The second contribution explores supervised contrastive learning in an iterative co-training scheme, in which two collaborating networks exchange confident pseudo-labels. This strategy improves representation quality and mitigates confirmation bias through iterative cross-training and adaptive adjustment of the loss function.
The third contribution integrates active learning into a single-run co-training scheme, selecting the most uncertain samples for annotation. By combining supervised contrastive learning, active learning, and pseudo-labeling into a unified optimization process, it improves generalization while minimizing annotation effort.
The fourth and final contribution proposes curriculum-based pseudo-label selection, gradually incorporating unlabeled data according to adaptive confidence thresholds. This approach simplifies the previous setup to a single network and combines unsupervised contrastive learning, supervised contrastive learning, and active learning to improve training stability and robustness.
Taken together, these contributions address key challenges of DSSL: training custom CNN architectures from scratch, reducing reliance on validation sets, improving feature representations, mitigating confirmation bias, and optimizing annotation efficiency. Evaluated on a variety of challenging biological image datasets, each approach achieved performance comparable to or better than the state of the art, offering practical solutions for real-world scenarios where labeled data are scarce and expensive to obtain.
Examination Board
Members:
Pedro Jussieu de Rezende IC / UNICAMP
David Menotti Gomes DInf / UFPR
Nina Sumiko Tomita Hirata IME / USP
Marcelo da Silva Reis IC / UNICAMP
Marcos Medeiros Raimundo IC / UNICAMP
Substitutes:
Priscila Tiemi Maeda Saito DC / UFSCar
Moacir Antonelli Ponti ICMC / USP
Roberto Hirata Junior IME / USP