What skills will be developed in the course?

The Data Science Diffusion Course trains professionals for the current job market, enabling them to identify new business opportunities through data analysis and predictive modeling with modern machine learning methods.


Diffusion Course

University Extension Modality

Email: datascience@ic.unicamp.br

Phone: (19) 3521-5883 / 3521-5861


About the course


The Data Science Diffusion Course consists of 5 modules covering the main concepts required by the job market, for a total workload of 100 hours: 80 hours of classes and 20 hours of supervised activities.

Digital Certificate

Students who pass the course will receive a certificate of participation in the Data Science Diffusion Course, issued by the Unicamp Extension School.


The faculty of the Data Science Diffusion Course consists of Unicamp professors and researchers, all of whom hold doctorates and have extensive experience in the field.

  • Modules

    Introduction to data analysis using the R language. Data types (vectors, lists, matrices, data frames, etc.). Built-in functions. Writing functions in R. Data processing, analysis and visualization.

    Knowledge discovery. Understanding and mining information. Exploratory data analysis. Anomaly detection. Association rules. Dimensionality reduction. Feature selection. Clustering techniques.

    Classification problems. Decision boundaries. Linear and non-linear classifiers, logistic regression, decision trees and random forests. Overfitting and validation. Ensemble methods: bagging, boosting and stacking. Cross-validation. Class imbalance, diagnosis of bias and variance. Evaluation measures. Model interpretation (X-AI) and open-set classification.

    Support Vector Machines (SVMs): kernels (linear and non-linear), SVRs and one-class SVMs. Regularization techniques. Grid search and random search. Neural networks: types of networks, forward and backward propagation, and activation functions. Statistical tests.

    Deep learning and convolutional neural networks (CNN). Convolution: padding and strides. Loss functions. Training: activation functions, pre-processing, data augmentation, weight initialization and parameter optimization. Regularization. Transfer learning. Recurrent Neural Networks (RNN). Transformers. Detection and segmentation. Generative Adversarial Networks (GAN). Interpretability (X-AI). Tools: TensorFlow and Keras.

  • Information

    Prerequisite: Basic knowledge of programming.
    Target audience: Computing professionals with a degree in Computing or a related area (Engineering or Exact Sciences).
    Course type: Diffusion Course - University Extension Modality.
    Required material: As this is a hands-on course, all students must bring their own laptops to class.
    Course coordinator: Prof. Dr. Zanoni Dias.
    Offering: Exclusively in the "in company" model (closed classes for companies). Request a quote by email.

    Every year we offer the Improvement Course in Complex Data Mining, an expanded version of the Data Science Diffusion Course. Learn more on the course website.
