Imitating reality: the role of mathematical models

Ghislaine Gayraud and Miraine Davila Felipe, respectively university professor and senior lecturer at UTC’s LMAC, are interested in the stochastic modelling of complex real systems. Irene Maffucci, a senior lecturer attached to GEC, specialises in structural bioinformatics. The three scientists are participating in the Lyme disease project led by Séverine Padiolleau.
What is the role played by mathematicians in this project? “The idea is to propose a stochastic model that generates in silico new DNA sequences resembling (from a probabilistic point of view) those obtained in vitro by Séverine’s team. From this family of sequences generated in silico, the next step is to select a few, known as ‘probes’, which appear to have good pairing capabilities with the protein of interest. Their selection consists of looking for those that are ‘close’, in terms of mathematical distance, to those that Séverine and Irene’s team have labelled as suitable candidates from among all the sequences obtained in vitro by SELEX. The entire study process begins with the experimental part (SELEX), continues with a mathematical model and tools, then goes back to the experimental part to validate the sequences selected in silico,” explains Ghislaine Gayraud.
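To make the idea of selecting sequences that are "close" to validated candidates more concrete, here is a minimal sketch in Python. It assumes the Hamming distance (the number of positions at which two sequences of equal length differ), which is only one possible choice; the article does not say which distance the team uses. The function names, the threshold and the toy sequences are all invented for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def select_probes(generated, references, max_distance):
    """Keep generated sequences within max_distance of at least one reference.

    Returns (sequence, distance to nearest reference) pairs, closest first.
    """
    selected = []
    for seq in generated:
        nearest = min(hamming(seq, ref) for ref in references)
        if nearest <= max_distance:
            selected.append((seq, nearest))
    return sorted(selected, key=lambda pair: pair[1])


# Toy data, invented for illustration (real sequences here have length 40).
references = ["ACGTACGT", "TTGACCGA"]                       # labelled as suitable in vitro
generated = ["ACGTACGA", "GGGGGGGG", "TTGACCGA", "ACGTTCGA"]  # produced in silico

probes = select_probes(generated, references, max_distance=2)
for seq, dist in probes:
    print(seq, dist)
```

In this toy run, the sequence made only of G is rejected because it lies far from both references, while the other three are kept and ranked by their distance to the nearest reference.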
What does “sequence” mean? “DNA is composed of four nitrogenous bases: A, T, C, G. A sequence is the succession of these bases positioned in a particular order and over a given length. The diversity of 10¹⁴ molecules mentioned by Stéphane corresponds to 10¹⁴ different sequences of length 40 at the initial stage of SELEX. The aim of this procedure is to identify potential probes in vitro,” emphasises Séverine Padiolleau.
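A quick calculation puts these figures in perspective. The sketch below assumes that the diversity figure quoted above is to be read as ten to the power 14, and compares it with the number of all possible sequences of length 40 over four bases. The variable names are chosen for illustration only.

```python
from fractions import Fraction

BASES = "ATCG"
LENGTH = 40
LIBRARY_SIZE = 10**14  # assumed reading of the diversity figure quoted above

# Each of the 40 positions can hold any of the 4 bases: 4**40 possibilities.
total_sequences = len(BASES) ** LENGTH

# Share of the full sequence space that the initial library can cover at most.
coverage = Fraction(LIBRARY_SIZE, total_sequences)

print(f"possible sequences of length {LENGTH}: {total_sequences:.3e}")
print(f"fraction covered by the library: {float(coverage):.1e}")
```

Under this assumption, the initial library covers only a tiny fraction of the possible sequences (on the order of one in ten billion), which helps explain the interest in a model able to propose candidates beyond those observed in vitro.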
How does the in vitro procedure work? “During this very time-consuming and complex process, there will be a number of sequences capable of recognising the protein of interest, but still fewer than all the possibilities. This is a major limitation of SELEX,” explains Miraine Davila Felipe.
This is where mathematics comes in. “Based on our in vitro results, Ghislaine and Miraine will propose a mathematical model to generate in silico new sequences that are different from ours but potentially functional. Of course, one might wonder what the point of this is, since we already have a good candidate in vitro. In fact, aiming at the same target is not an end in itself. On the one hand, it increases the number of probes available. On the other hand, this model could be applied to target other proteins of interest without having to implement the SELEX procedure,” details Séverine Padiolleau.
Ghislaine and Miraine, supported by a post-doc, drew on an existing family of models, the Restricted Boltzmann Machines (RBM) family. “The RBM is a two-layer graphical model with an input layer consisting of the sequence and a hidden layer that is supposed to take into account the 2D or, even better, the 3D structure of the sequence when it folds. The presence of this hidden layer is important because the way the sequence folds is fundamental to ensuring its pairing with the target protein,” adds Ghislaine.
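The two-layer structure described here can be illustrated with a toy sketch in Python. It is not the team's model: the parameters are random rather than learned from SELEX data, training is left out entirely, and the choice of one categorical (softmax) unit per sequence position is an assumption made for this example. The hidden units are plain binary variables with no structural meaning attached. The names TinyRBM, one_hot and gibbs_step are invented.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a flat list of 0/1 values, four per position."""
    return [1 if base == b else 0 for base in seq for b in BASES]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Two-layer model with random, untrained parameters (illustration only)."""

    def __init__(self, seq_len, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.seq_len = seq_len
        self.n_visible = 4 * seq_len
        self.n_hidden = n_hidden
        # One weight per (visible unit, hidden unit) pair; no weights within a layer.
        self.W = [[self.rng.gauss(0, 0.1) for _ in range(n_hidden)]
                  for _ in range(self.n_visible)]
        self.b = [0.0] * self.n_visible  # visible biases
        self.c = [0.0] * n_hidden        # hidden biases

    def sample_hidden(self, v):
        """Sample each binary hidden unit given the encoded sequence."""
        h = []
        for j in range(self.n_hidden):
            act = self.c[j] + sum(v[i] * self.W[i][j] for i in range(self.n_visible))
            h.append(1 if self.rng.random() < sigmoid(act) else 0)
        return h

    def sample_visible(self, h):
        """Sample one base per position from a softmax over the four bases."""
        seq = []
        for pos in range(self.seq_len):
            logits = []
            for k in range(4):
                i = 4 * pos + k
                logits.append(self.b[i] + sum(self.W[i][j] * h[j]
                                              for j in range(self.n_hidden)))
            top = max(logits)  # subtract the maximum for numerical stability
            weights = [math.exp(value - top) for value in logits]
            seq.append(self.rng.choices(BASES, weights=weights)[0])
        return "".join(seq)

    def gibbs_step(self, seq):
        """One alternating update: sequence -> hidden layer -> new sequence."""
        return self.sample_visible(self.sample_hidden(one_hot(seq)))


# Toy usage: start from a short sequence and alternate between the two layers.
rbm = TinyRBM(seq_len=8, n_hidden=5)
seq = "ACGTACGT"
for _ in range(3):
    seq = rbm.gibbs_step(seq)
print(seq)
```

With random parameters the generated sequences carry no biological meaning. In a real setting the weights and biases would first be fitted to the sequences obtained in vitro, so that alternating between the two layers produces new sequences that resemble them from a probabilistic point of view.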
Numerous theses in support of the project
Work on Lyme disease began in 2021 with Mickaël Guérin’s thesis. At UTC’s GEC Laboratory, it has continued since 2024 with Hugo Da-Ponte’s thesis supervised by Séverine Padiolleau, Selma Benguaouer’s thesis co-supervised by Irene Maffucci and Bérangère Bihan-Avalle, and the support of Pauline Trézel, a post-doc. At the LMAC, post-doc El Mehdi Issouani is supporting Miraine Davila Felipe and Ghislaine Gayraud. Marc Shawky is currently co-supervising, with Florian de Vuyst, Teresa Ciavattini’s thesis on the classification of patient data in cooperation with Mater Misericordiae University Hospital, Dublin, Ireland. Finally, a post-doctoral researcher will be recruited to develop machine learning tools to assist clinicians in their decisions to start a given treatment.
MSD




