Educational offering, academic year 2024/2025

First year

Activities
Advanced institutional courses: YES
Seminar or laboratory activities: YES
Research-related activities: YES
Training and research activities independently chosen by the doctoral candidate and approved by the Academic Board: YES
List of courses/activities
Course/activity, hours
Indices of Centrality for Complex Networks and their Efficient Computation
We introduce the main centrality and role indices used to rank nodes and/or data in large complex networks, and then describe algorithmic methods to compute them efficiently.
Category: other
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Seminar series
Hours: 15
Mining Massive Data
The course is organized in 3 modules.
- Mining Huge Data Sets. One of the key problems in Data Mining is to quickly retrieve all items that are similar according to effective notions of similarity, such as Jaccard similarity. We cover the Locality Sensitive Hashing technique, which can be used to break the n^2-time barrier required to solve the problem in the worst case. We introduce the framework of Streaming Algorithms: algorithms for problems in which the input is so large that it cannot even be stored in memory, so the algorithm can look at each element of the input just once, in an online fashion. We study the problems of counting distinct elements, finding the most frequent elements, and finding the number of elements in a given queried window that meet a certain criterion.
- Web Search Engines. The PageRank Algorithm: introduction to the key algorithmic ideas of the PageRank algorithm and how it computes an effective popularity score that modern Web search engines apply to rank Web sites and pages.
- The Bitcoin Lightning Network. Introduction to the fully decentralized system designed to manage the massive data generated by the micropayments that take place over the Bitcoin network.
Expected Background: Undergraduate courses in Algorithms & Data Structures and in Probability.
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturers: Prof. Clementi, Prof. Gualà, Prof. Pasquale ({clementi,guala,pasquale}@mat.uniroma2.it)
Hours: 12
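As a rough illustration of the similarity-search topic named in the Mining Massive Data description above, the Python sketch below compares the exact Jaccard similarity of two sets with a MinHash estimate. MinHash is the signature step commonly paired with Locality Sensitive Hashing for Jaccard similarity; the banding step of a full LSH index is not shown. The helper `seeded_hash`, the choice of 256 hash functions, and the two example sets are assumptions made for this sketch and are not taken from the course material.

```python
import hashlib


def jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def seeded_hash(seed, item):
    """Hypothetical helper: deterministic 64-bit hash of the pair (seed, item)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=256):
    """For each seeded hash function, keep the minimum hash value over the set.

    Assumes `items` is non-empty.
    """
    return [min(seeded_hash(seed, x) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions on which the two signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


# Illustrative data: two integer ranges sharing 50 of their 150 distinct elements.
set_a = set(range(0, 100))
set_b = set(range(50, 150))

exact = jaccard(set_a, set_b)
approx = estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b))
print(f"exact={exact:.3f} estimate={approx:.3f}")
```

The signatures have fixed length regardless of set size, so comparing them costs the same for small and huge sets; the estimate tightens as `num_hashes` grows.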
Web of Data
The course introduces the Web of Data, as outlined by the Semantic Web and Linked Open Data, as an extension of the Web into a global dataspace for the publication, reuse and integration of data. Best practices and (open) standards will be discussed as part of the course, emphasizing machine actionability, a core value of the FAIR paradigm for data custody. Building on the distributed and decentralized nature of the Web of Data, which is a prerequisite for autonomy and independence, we will discuss how to avoid the data silos phenomenon through a distributed, as-needed integration process. In this regard, ontology matching and entity linking techniques will be discussed. The dual role of the Web of Data, as a controlled environment for Big Data experimentation and as a source of background knowledge for information extraction and content analytics in general, will also be mentioned. Various examples of big datasets will be shown throughout the course, including general resources, such as DBpedia and Wikidata, and more domain-specific GLAM (Galleries, Libraries, Archives, and Museums) resources. Concrete examples of tabular data lifting and modern standards for their semantic annotation will also be shown. Background: Foundations of Logic, Databases, and the Java or Python languages.
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturer: Dr. Manuel Fiorelli (fiorelli@info.uniroma2.it)
Hours: 14
Hands on Machine Learning for Physics
The course aims to deepen the concepts, techniques and tools needed to construct the Machine Learning algorithms mainly used in Physics. The target audience is PhD students who want to learn how to program Machine Learning (ML) codes for the data analysis of physics problems. During the course, special emphasis will be given to how generative algorithms work, such as Variational Auto-Encoders and Generative Adversarial Networks. The course starts with a brief theoretical reminder of the problem and then continues with the implementation of an ML algorithm in all its phases, from the construction of the dataset to the validation of the results.
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturer: Michele Buzzicotti (m.buzzicotti@gmail.com)
Hours: 14

Additional information on the training plan, 1st year
Average total hours/year: 20
Total course hours: 3
Other teaching activities: lectures and exercise sessions in Bachelor's and Master's degree courses
Method of choosing the thesis topic: independent
Assessment for admission to the following year: collection of information, final seminar on the activities carried out during the year and on progress toward the doctoral thesis
Notes

Second year

Activities
Advanced institutional courses: YES
Seminar or laboratory activities: YES
Research-related activities: YES
Training and research activities independently chosen by the doctoral candidate and approved by the Academic Board: YES
List of courses/activities
Course/activity, hours
Deep Learning and Structured Inference – Neural Models and Algorithms for Linguistic Recognition and Inference
Modern AI is increasingly faced with complex problems, characterized by heterogeneous forms of structured evidence in input and by complex decisions. In medicine, historical data, biological phenomena or images manifest through streams of structured data, usually represented digitally as sequences, trees or graphs. Machine Learning methods for structured learning have been studied, and several mathematical paradigms (such as dimensionality reduction, structured kernels or neural embeddings) have been proposed as modeling tools. In Natural Language Processing, Machine Translation and other Natural Language Inference (NLI) tasks, such as Question Answering or Textual Entailment, have been approached via kernels or neural models of the input representation. These have achieved accurate, state-of-the-art classification and prediction capabilities by enabling the exploration of huge spaces of possible solutions (e.g. target sequences or decisions). In this way, they serve both as enabling technologies and software tools and as models of investigation able to systematically select hypotheses and validate controversial theories about linguistic phenomena. The application of these empirical methodologies to other areas such as biology, medicine and medical robotics is more than promising, given the similar complexity of the domains targeted by AI and the Life Sciences. The course aims to promote this research perspective in Deep Learning to PhD students, with a specific focus on, but not limited to, Life Science phenomena.
Category: other
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Seminar series
Hours: 12
An introduction to score-based generative models
In simple words, generative modeling consists in learning a map capable of generating new data instances that resemble a given set of observations, starting from a simple prior distribution, most often a standard Gaussian distribution. This course aims at providing a mathematical introduction to generative models and in particular to Score-based Generative Models (SGM). SGMs have gained prominence for their ability to generate realistic data across diverse domains, making them a popular tool for researchers and practitioners in machine learning. Participants will learn about the methodological and theoretical foundations, as well as some practical applications associated with these models. The first two lectures motivate the use of generative models, introduce their formalism and present two simple though relevant examples: energy-based models and Generative Adversarial Networks. In the third and fourth lecture we present score-based diffusion models and explain how they provide an algorithmic framework to the basic idea that sampling from the time-reversal of a diffusion process converts noise into new data instances. We shall do so following two different approaches: a first elementary one that only relies on discrete transition probabilities, and a second one based on stochastic calculus. After this introduction, we derive sharp theoretical guarantees of convergence for score-based diffusion models assembling together ideas coming from stochastic control, functional inequalities and regularity theory for Hamilton-Jacobi-Bellman equations. The course ends with an overview of some of the most recent and sophisticated algorithms such as flow matching and diffusion Schrödinger bridges (DSB), which bring an (entropic) optimal transport insight into generative modeling.
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: humanities
Advanced training: YES
Final assessment: YES
Language: ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturers: Profs. Giovanni Conforti (Università di Padova) and Alain Durmus (École Polytechnique, Paris)
Hours: 12
Quantile regression
The main techniques of quantile regression, an alternative to classical linear regression, will be introduced. As an example, consider a regression model in which we estimate the association between the Equivalised Disposable Income of a sample of households and various predictors, including an exogenous treatment. Using quantile regression, it is possible to estimate the effect of the treatment on the entire income distribution of households, resulting in a potentially different estimated effect at each quantile. Indeed, the treatment effect could be positive for the income of rich households (high quantiles) and negative for the income of poor households (low quantiles). Similarly, the association of predictors with median income can be evaluated, avoiding the need to assume that the response is Gaussian (symmetric, homoscedastic) and that there are no outliers. If time permits, principles of robust statistics will also be discussed, including linear regression techniques and robust prediction. Background: Use of the R software. Undergraduate courses in Statistical Inference and Linear Models.
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturer: Prof. Alessio Farcomeni (alessio.farcomeni@uniroma2.it)
Hours: 0
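To make the quantile-by-quantile treatment effect in the Quantile regression description above concrete, here is a minimal Python sketch (the course itself lists R as its software). It covers only the simplest case, a single binary treatment and no other predictors, where the quantile regression coefficient reduces to the difference between the group quantiles. Each quantile is obtained by minimizing the check (pinball) loss. The income figures and the function names are invented for illustration and are not course data.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1) * residual


def fit_quantile(values, tau):
    """Return the observed value that minimizes the total pinball loss.

    A tau-quantile of the sample always minimizes this loss, so searching
    over the observed values is enough for this small sketch.
    """
    def total_loss(q):
        return sum(pinball_loss(v - q, tau) for v in values)

    return min(sorted(values), key=total_loss)


def quantile_treatment_effect(control, treated, tau):
    """Difference between the tau-quantiles of the treated and control groups."""
    return fit_quantile(treated, tau) - fit_quantile(control, tau)


# Invented incomes: the treatment stretches the distribution around 50,
# lowering low incomes and raising high ones.
control = [10, 20, 30, 40, 50, 60, 70, 80, 90]
treated = [2, 14, 26, 38, 50, 62, 74, 86, 98]

for tau in (0.1, 0.5, 0.9):
    effect = quantile_treatment_effect(control, treated, tau)
    print(f"tau={tau}: estimated effect {effect}")
```

In this toy data the effect is negative at the low quantile, zero at the median and positive at the high quantile, which a single mean-based regression coefficient would hide.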
Simulation-based Predictive Process Mining
The course introduces the essential elements of process mining (PM) and simulation. These approaches are initially proposed as tools for analyzing processes from different perspectives, to achieve different objectives. While PM aims to extract knowledge by analyzing a log that records data on past process executions, simulation provides predictions on future or alternative behaviors of the same process. Then, an innovative point of view is proposed in which PM and simulation are seen as complementary tools whose joint adoption leads to an effective analysis paradigm. The first part of the course introduces basic concepts on simulation: simulation modeling, discrete event simulation, local and distributed simulation. The implementation of a Java-based discrete event simulator is also discussed. In the second part, principles, methods, and tools for PM are provided. Finally, the course introduces “Predictive Process Mining” as an innovative paradigm based on the joint use of the two approaches. It is outlined how the knowledge extracted from the log analysis through PM techniques can be used to guide the development of a simulation model, whose execution provides further insights into the system under study. In this context, the most relevant research challenges, opportunities and open issues are illustrated. Background: Basic skills in software development and knowledge of at least one object-oriented programming language (Java recommended).
Category: dedicated training schools
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: YES
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Lecturer: Dr. Paolo Bocciarelli (paolo.bocciarelli@uniroma2.it)
Hours: 12

Additional information on the training plan, 2nd year: In the first two years of the doctorate, students have a wide choice among courses of differing nature, tied to mathematical-algorithmic paradigms and to the methods and technologies for data reuse in various experimental, modeling and applied (industrial) settings.

They are required to attend at least two courses.
Average total hours/year: 12
Total course hours: 12
Other teaching activities: laboratory experimentation, classroom teaching in support of Master's degree courses
Method of thesis preparation: in collaboration with local or foreign research centers
Assessment for admission to the following year: research seminars, summary report and research plan for the doctoral thesis
Notes

Third year

Activities
Advanced institutional courses: NO
Seminar or laboratory activities: YES
Research-related activities: YES
Training and research activities independently chosen by the doctoral candidate and approved by the Academic Board: NO
List of courses/activities
Course/activity, hours
Preparation of the doctoral thesis
Seminars, experimentation sessions
Category: workshop
Course type: international
Macro-sector: open science
Area: scientific
Advanced training: YES
Final assessment: NO
Language: ITALIAN/ENGLISH
Mode: attributable to the doctoral candidate's training plan
Hours: 0


Additional information on the training plan, 3rd year: The research plan produced in the second year is confirmed at the end of that year and monitored by the Academic Board during the first semester of the third year.
Average total hours/year: 25
Total course hours: 25
Other teaching activities: seminars, teaching support for Master's degree courses
Admission to the final examination: research seminars, updates on progress toward the final doctoral thesis
Conduct of the final examination: evaluation by external reviewers, evaluation by an internal committee, defense of the thesis before a panel of three faculty members
Notes

Università degli Studi di Roma "Tor Vergata" - Via Cracovia, 50, 00133 Roma RM