Thematic School 2021 : Graph as models in life sciences: Machine learning and integrative approaches

“Graph as models in life sciences: Machine learning and integrative approaches” is a Fall School (mid-October 2021) supported by the Labex DigiCosme on bioinformatics and statistical/machine learning, with graphs as a central theme.

This thematic school is interdisciplinary and dedicated to the application of state-of-the-art machine learning methodologies to ongoing challenges in bioinformatics research. Recently developed statistical methods that integrate prior domain knowledge have already proven effective in major subfields of bioinformatics (for example, graph neural networks for protein analysis and variational autoencoders for transcriptomic data).

The objective of the school is to strengthen the links and exchanges between laboratories and researchers in this interdisciplinary community, and to expose PhD students and young researchers working in one of the two disciplines (bioinformatics or machine learning) to the other. Having this opportunity early in a career is essential to bolster interdisciplinary research.

In particular, the school wishes to contribute to the dissemination of good practices in the use of machine learning as a modelling tool in biology. Speakers will detail methodologies and biological outcomes, as well as risks, biases and their active mitigation in the context of predictive bioinformatics. Data privacy and leakage, for example in the medical field, are also among the challenges faced by researchers, companies and civil society, and they raise technical questions specific to the data being handled.

Hence, we believe it is stimulating to address these methodologies and issues in a contextualized way, while providing sufficient theoretical ground to allow participants to transfer the acquired knowledge to other biological problems.


25th – 29th of October




Registration is free but mandatory*

* Limited number of places for the tutorial


Day 1 – Monday 25th October
  8 h 45 – 9 h 15     Introduction                Flora Jay / Yann Ponty
  9 h 15 – 12 h 45    Lecture 1                   Laurent Jacob
  14 h 00 – 17 h 30   Lecture 2                   Simona Coco
Day 2 – Tuesday 26th October
  9 h 00 – 12 h 30    Tutorial topic 1, 2         Laurent Jacob / Simona Coco
  14 h 00 – 17 h 30   Lecture 3                   Sergei Grudinin
Day 3 – Wednesday 27th October
  9 h 00 – 12 h 30    Lecture 4                   Chloé-Agathe Azencott
  14 h 00 – 17 h 30   Tutorial topic 3, 4         Chloé-Agathe Azencott / Sergei Grudinin
Day 4 – Thursday 28th October
  9 h 00 – 12 h 30    Lecture 5                   Andrei Zinovyev
  14 h 00 – 17 h 30   Participants’ talks/poster  Everyone
Day 5 – Friday 29th October
  9 h 00 – 12 h 30    Lecture 6                   Jean Louis Raisaro
  14 h 00 – 17 h 30   Tutorial topic 5, 6         Jean Louis Raisaro / Andrei Zinovyev

*Exact times may change


  • Simona Coco
  • Sergei Grudinin
  • Jean Louis Raisaro


  • Lecture/Tutorial 1 – Laurent Jacob: Learning with biological sequences, from neural networks to De Bruijn graphs
    The tutorial will introduce elementary tools to make predictions from unaligned biological sequences and to perform genome-wide association studies over such sequences.
    Keywords: convolutional neural networks, sequence motifs, k-mers, compacted de Bruijn graphs, bacterial GWAS.
  • Lecture/Tutorial 2 – Simona Coco: Statistical learning to infer structural evolution (DCA, RBMs…)
  • Lecture/Tutorial 3 – Sergei Grudinin: Neural networks (CNN, GNN, …) for protein structure and interaction prediction
  • Lecture/Tutorial 4 – Chloé-Agathe Azencott: Boosting genome-wide association studies (GWAS) with biological networks
    The lecture will be based on the following publication:
    Tutorial: GWAS and biological networks. The tutorial will be an opportunity for participants to dig deeper into both the motivation for incorporating biological networks into GWAS and several of the existing models for this purpose. It will be organized as a discussion around several published models, and participants are welcome to bring their own questions. This is not a hands-on session.
    Keywords: networks, GWAS, graph regularization, diffusion on graphs, omnigenic model.
  • Lecture/Tutorial 5 – Andrei Zinovyev: Structured learning for single-cell differentiation trajectories
    Thanks to the emergence of single-cell assays, it is now possible to measure gene expression and other genome-scale molecular profiles across thousands to millions of single cells (scRNA-Seq, scATAC-seq, single-cell proteomic data). Using these data, one can look for paths that may be associated with the level of cellular commitment to a specific biological process. The position of a cell along such a path (the path is called a trajectory, the position a pseudotime) can then be used to explore how gene expression reflects changes in cell state as cells progressively commit to a given fate. This kind of analysis is a powerful tool that has been used, for example, to explore the biological changes associated with development, cellular differentiation and cancer biology. Several graph-based approaches have been suggested to extract cellular trajectories from single-cell data, including minimal spanning trees and principal graphs. Principal graphs approximate the multivariate data by a graph injected into the data space with some constraints imposed on the node mapping. In the lecture and tutorial, we will explore the basic concepts and tools for applying graph-based approaches to single-cell scRNA-Seq datasets.
  • Lecture/Tutorial 6 – Jean Louis Raisaro: Privacy-preserving learning for health/life science data
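
As a rough illustration of the trajectory idea described for Lecture/Tutorial 5, the sketch below builds a minimal spanning tree over a handful of toy "cells" and assigns each one a pseudotime equal to its distance from a chosen root cell, measured along the tree. It is a minimal sketch under stated assumptions, not the method taught at the school: the function names (`minimum_spanning_tree`, `pseudotime`), the 2-D coordinates and the choice of root are all hypothetical, and real pipelines work on high-dimensional expression data, typically after dimensionality reduction, often with principal graphs rather than a plain spanning tree.

```python
import math
from collections import defaultdict


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Returns an adjacency map {i: [(j, weight), ...]} describing the tree.
    """
    n = len(points)
    tree = defaultdict(list)
    if n == 0:
        return tree
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest edge linking each point to the tree so far
    parent = [-1] * n
    best[0] = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            tree[u].append((parent[u], best[u]))
            tree[parent[u]].append((u, best[u]))
        # Update the attachment cost of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return tree


def pseudotime(points, root):
    """Distance from `root` to every point, measured along the spanning tree."""
    tree = minimum_spanning_tree(points)
    times = {root: 0.0}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, w in tree[u]:
            if v not in times:
                times[v] = times[u] + w
                stack.append(v)
    return [times[i] for i in range(len(points))]


# Toy data (assumption): five cells forming a Y-shaped, branching trajectory.
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
print(pseudotime(cells, root=0))
```

In this toy example the two branch tips end up with the same pseudotime (2 + √2), which reflects the fact that they sit at the same tree distance from the root but on different branches. A spanning tree passes through every cell and is therefore sensitive to noise, which is one reason smoother structures such as principal graphs are used in practice.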


  • Flora Jay, CR CNRS, LISN
  • Yann Ponty, DR CNRS, LIX
  • Ariane Migault, Communications Officer, Labex DigiCosme