“Graph as models in life sciences: Machine learning and integrative approaches” is a Fall School (mid-October 2021) supported by the Labex DigiCosme on bioinformatics and statistical/machine learning, with graphs as a central theme.
This thematic school is interdisciplinary and dedicated to applying state-of-the-art machine learning methodologies to ongoing challenges in bioinformatics research. Recently developed statistical methods that integrate a priori domain knowledge have already proven effective in major subfields of bioinformatics (such as graph neural networks for protein analysis, variational autoencoders for transcriptomic data, …).
The objective of the school is to strengthen the links and exchanges between laboratories and researchers in this interdisciplinary community, and also to expose PhD students and young researchers working in one of the two disciplines (bioinformatics or machine learning) to the other. Having this opportunity early in a career is essential to bolster interdisciplinary research.
In particular, the school wishes to contribute to the dissemination of good practices in the use of machine learning as a modelling tool in biology. Speakers will detail methodologies and biological outcomes, as well as risks, biases and their active mitigation in the context of predictive bioinformatics. Data privacy and leakage, for example in the medical field, are also among the challenges faced by researchers, companies and civil society. These issues raise technical questions specific to the data being handled.
Hence, we believe it is stimulating to address these methodologies and issues in a contextualized way, while providing sufficient theoretical ground to allow participants to transfer the acquired knowledge to other biological problems.
25th – 29th of October
Registration is free but mandatory*
* Limited number of places for the tutorial
|Day|Time|Session|Speaker|
|---|---|---|---|
|Day 1 – Monday 25th October|8 h 45 – 9 h 15|Introduction|Flora Jay / Yann Ponty|
| |9 h 15 – 12 h 45|Lecture 1|Laurent Jacob|
| |14 h 00 – 17 h 30|Lecture 2|Simona Coco|
|Day 2 – Tuesday 26th October|9 h 00 – 12 h 30|Tutorial topic 1, 2|Laurent Jacob / |
| |14 h 00 – 17 h 30|Lecture 3|Sergei Grudinin|
|Day 3 – Wednesday 27th October|9 h 00 – 12 h 30|Lecture 4|Chloé-Agathe Azencott|
| |14 h 00 – 17 h 30|Tutorial topic 3, 4|Chloé-Agathe Azencott / |
|Day 4 – Thursday 28th October|9 h 00 – 12 h 30|Lecture 5|Andrei Zinovyev|
| |14 h 00 – 17 h 30|Participants’ talks/poster|Everyone|
|Day 5 – Friday 29th October|9 h 00 – 12 h 30|Lecture 6|Jean Louis Raisaro|
| |14 h 00 – 17 h 30|Tutorial topic 5, 6|Jean Louis Raisaro / |
* Exact times may change
- Simona Coco
- Sergei Grudinin
- Jean Louis Raisaro
- Lecture/Tutorial 1 – Laurent Jacob: Learning with biological sequences, from neural networks to de Bruijn graphs
The tutorial will introduce elementary tools to make predictions from unaligned biological sequences and to perform genome-wide association studies over such sequences.
Keywords: convolutional neural networks, sequence motifs, k-mers, compacted de Bruijn graphs, bacterial GWAS.
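As a rough illustration of two of these keywords (not the tutorial's own material), the sketch below counts k-mers in unaligned sequences and builds a plain, uncompacted de Bruijn graph whose nodes are (k-1)-mers. The function names `kmers` and `de_bruijn_graph` and the two toy sequences are assumptions made up for this example.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def de_bruijn_graph(sequences, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some k-mer."""
    graph = defaultdict(set)
    for seq in sequences:
        for kmer in kmers(seq, k):
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Toy, made-up sequences that differ only in their last base.
sequences = ["ACGTT", "ACGTA"]
k = 3

# k-mer counts across all sequences, usable as alignment-free features.
counts = Counter(km for seq in sequences for km in kmers(seq, k))

# The shared prefix forms a single path; the variant creates a branch at "GT".
graph = de_bruijn_graph(sequences, k)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy case the two sequences share the path AC, CG, GT and diverge only at the last node, which is the kind of local variation that graph-based association studies aim to pick up. A compacted de Bruijn graph would additionally merge non-branching paths into single nodes, which this sketch does not do.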
- Lecture/Tutorial 2 – Simona Coco: Statistical learning to infer structural evolution (DCA, RBMs…)
- Lecture/Tutorial 3 – Sergei Grudinin: Neural networks (CNN, GNN, …) for protein structure and interaction prediction
- Lecture/Tutorial 4 – Chloé-Agathe Azencott: Boosting genome-wide association studies (GWAS) with biological networks
The lecture will be based on the following publication: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008819
Tutorial: GWAS and biological networks. The tutorial will be the opportunity for participants to dig deeper into both the motivation for incorporating biological networks into GWAS and several of the existing models for this purpose. The tutorial will be organized as a discussion around several published models. Participants are welcome to join with their own questions for discussion. This is not a hands-on session.
Keywords: networks, GWAS, graph regularization, diffusion on graphs, omnigenic model.
- Lecture/Tutorial 5 – Andrei Zinovyev: Structured learning for single-cell differentiation trajectories
Thanks to the emergence of single-cell assays, it is now possible to measure gene expression levels and other genome-scale molecular profiles across thousands to millions of single cells (scRNA-Seq, scATAC-seq, single-cell proteomic data). Using these data, one can look for paths that may be associated with the level of cellular commitment with respect to a specific biological process. The position of cells along these paths (the paths are called trajectories, the position pseudotime) can then be used to explore how gene expression reflects changes in cell state as cells progressively commit to a given fate. This kind of analysis is a powerful tool that has been used, for example, to explore the biological changes associated with development, cellular differentiation and cancer. Several graph-based approaches have been suggested to extract cellular trajectories from single-cell data, including minimum spanning trees and principal graphs. Principal graphs approximate the multivariate data by a graph embedded into the data space, with some constraints imposed on the node mapping. In the lecture and tutorial, we will explore the basic concepts and tools for applying graph-based approaches to scRNA-Seq datasets.
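To make the minimum spanning tree idea concrete, here is a minimal sketch, not the lecture's method: it builds a Euclidean minimum spanning tree over a handful of points with Prim's algorithm and defines pseudotime as the distance travelled along the tree from a chosen root cell. The 2-D coordinates stand in for cells after dimensionality reduction and are invented for illustration; the function names and the choice of root are likewise assumptions.

```python
import math
from collections import deque


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph; returns an adjacency dict."""
    n = len(points)
    tree = {i: [] for i in range(n)}
    # best[j] = (distance to the current tree, tree node achieving it)
    best = {j: (math.dist(points[0], points[j]), 0) for j in range(1, n)}
    while best:
        # Attach the outside node that is closest to the tree.
        j = min(best, key=lambda v: best[v][0])
        d, parent = best.pop(j)
        tree[parent].append((j, d))
        tree[j].append((parent, d))
        # The new tree node may offer a shorter connection to remaining nodes.
        for v in best:
            dv = math.dist(points[j], points[v])
            if dv < best[v][0]:
                best[v] = (dv, j)
    return tree


def pseudotime(tree, root):
    """Distance from root to every node, measured along tree edges."""
    time = {root: 0.0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, d in tree[u]:
            if v not in time:
                time[v] = time[u] + d
                queue.append(v)
    return time


# Made-up "cells": a linear path that splits into two branches at cell 2.
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
tree = minimum_spanning_tree(cells)
times = pseudotime(tree, root=0)
for cell in sorted(times):
    print(cell, round(times[cell], 3))
```

Cells 3 and 4 end up on separate branches with the same pseudotime, which mirrors two fates diverging from a common progenitor. Principal graph methods differ in that the graph nodes are fitted to the data rather than being the cells themselves.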
- Lecture/Tutorial 6 – Jean Louis Raisaro: Privacy-preserving learning for health/life science data
- Flora Jay, CR CNRS, LISN
- Yann Ponty, DR CNRS, LIX
- Ariane Migault, Communications Officer, Labex DigiCosme