Structural Alphabets & Hierarchical Analysis
Standard protein structure analysis relies heavily on alpha-helices and beta-sheets. However, roughly 40-50% of residues fall into heterogeneous "coils" or "loops", and these regions are often functionally critical (binding sites, active sites).
Our team pioneered the Structural Alphabet concept to resolve this "coil problem". Protein Blocks (PBs) are 16 structural prototypes (labeled a-p), each representing a specific conformational state of a pentapeptide (5 consecutive amino acids).
This allows a 3D structure to be encoded as a 1D string (e.g., "bccdfklmmm..."), so that sequence alignment algorithms can be applied to structural data, bypassing the limitations of sequence-based homology detection.
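To illustrate why a 1D encoding is convenient, here is a minimal sketch that aligns two PB strings with a textbook Needleman-Wunsch global alignment. The function names, the match/mismatch/gap scores, and the two example strings are assumptions made for illustration only; a real application would use a PB-specific substitution matrix rather than the identity scoring assumed here.

```python
PB_ALPHABET = set("abcdefghijklmnop")  # the 16 Protein Block labels (a-p)


def validate_pb(seq):
    """Raise ValueError if seq contains a character outside a-p."""
    bad = set(seq) - PB_ALPHABET
    if bad:
        raise ValueError(f"not valid PB letters: {sorted(bad)}")
    return seq


def align_pb(s1, s2, match=2, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment of two PB strings.

    Scores are illustrative placeholders, not a published PB matrix.
    Returns (score, aligned_s1, aligned_s2).
    """
    validate_pb(s1)
    validate_pb(s2)
    n, m = len(s1), len(s2)

    # Dynamic-programming table; first row/column hold leading-gap costs.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align s1[i-1] with s2[j-1]
                score[i - 1][j] + gap,      # gap in s2
                score[i][j - 1] + gap,      # gap in s1
            )

    # Traceback from the bottom-right corner to recover one optimal alignment.
    a1, a2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if s1[i - 1] == s2[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                a1.append(s1[i - 1])
                a2.append(s2[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            a1.append(s1[i - 1])
            a2.append("-")
            i -= 1
        else:
            a1.append("-")
            a2.append(s2[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(a1)), "".join(reversed(a2))


if __name__ == "__main__":
    # Two invented PB strings that differ by a single inserted block.
    total, top, bottom = align_pb("bccdfklmmm", "bccdklmmm")
    print(total)
    print(top)
    print(bottom)
```

Because the input is an ordinary string, any existing sequence-alignment tooling can be substituted for this hand-written routine; only the alphabet and the scoring scheme change.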
Complementing PBs, we developed Protein Units (PUs), a hierarchical level of description between secondary structures and domains. PUs are compact structural units that facilitate the analysis of protein folding and flexibility.
The 16-state PB alphabet and Protein Units (PUs) allow for:
- More precise structure description (16 prototypes vs. 3 secondary-structure states)
- Enhanced sequence-structure relationship analysis
- Improved fold recognition for remote homologs
- Mining of conformational changes in protein dynamics
Deep learning & artificial intelligence
We integrate Protein Language Models (PLMs) and predictive AI to move from descriptive bioinformatics to generative methodologies.
Our tools leverage embeddings from PLMs (Ankh, ESM2, ProtTrans) that have learned the "language" of proteins from billions of sequences.
PEGASUS
PLM-based flexibility prediction. PEGASUS leverages Protein Language Model embeddings (Ankh, ESM2, ProtTrans) to predict structural flexibility metrics directly from sequence, superseding MEDUSA with enhanced performance.
Predicts root-mean-square fluctuation (RMSF), per-residue stability scores, and torsion angle deviations. Trained on the ATLAS database (Vander Meersche et al., Protein Science 2025).
MEDUSA
Deep Learning flexibility prediction. Our first AI tool for predicting protein flexibility (B-factors) from sequence using deep neural networks.
Provides rapid assessment of conformational dynamics without costly MD simulations (Vander Meersche et al., JMB 2021).
PYTHIA
CNN-based local structure prediction. Uses convolutional neural networks to predict Protein Block sequences directly from amino acid sequences.
Enhanced local structure prediction for proteins (Cretin et al., IJMS 2021).
Key Research Directions
- Protein Language Model embeddings for flexibility prediction
- Convolutional neural networks for local structure
- Sequence-to-dynamics machine learning pipelines
- Critical assessment of AlphaFold2 predictions
Molecular Dynamics & Membrane Proteins
Computational biophysics is central to understanding how membrane proteins function in their native environment. We use extensive molecular dynamics simulations to explore the dynamic behavior of these crucial biological systems.
Biological membranes contain thousands of lipid species, and membrane protein function depends critically on this lipid environment. Our simulations capture this complexity to provide testable predictions.
ATLAS Database
A database of atomistic molecular dynamics simulations for proteins in the human proteome, providing per-residue flexibility metrics for benchmarking prediction methods (Vander Meersche et al., NAR 2023).
Research Focus
- Transporter protein mechanisms
- Glucose transporter GluT1 dynamics
- Lipid-protein interactions
- Protein flexibility from MD trajectories
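Per-residue flexibility from an MD trajectory is commonly summarized as the root-mean-square fluctuation (RMSF), the square root of each atom's time-averaged squared deviation from its mean position. The sketch below assumes the frames have already been superposed onto a common reference and are stored as a NumPy array of shape (n_frames, n_atoms, 3), for example one C-alpha atom per residue. The function name and the toy data are hypothetical and are not taken from the ATLAS pipeline.

```python
import numpy as np


def rmsf(coords):
    """Per-atom RMSF from pre-aligned coordinates.

    coords: array-like of shape (n_frames, n_atoms, 3).
    Returns an array of shape (n_atoms,), in the same length unit as coords.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 3 or coords.shape[2] != 3:
        raise ValueError("expected shape (n_frames, n_atoms, 3)")
    mean_pos = coords.mean(axis=0)                    # average position per atom
    sq_dev = ((coords - mean_pos) ** 2).sum(axis=2)   # squared distance per frame
    return np.sqrt(sq_dev.mean(axis=0))               # time average, then root


if __name__ == "__main__":
    # Toy trajectory: atom 0 never moves, atom 1 oscillates by +/- 1 along x.
    toy = [
        [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
        [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
    ]
    print(rmsf(toy))
```

Skipping the superposition step would mix overall rotation and translation of the protein into the result, so the alignment assumption matters in practice.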
Structural Analysis & Fold Recognition
We develop methods for hierarchical protein structure analysis, from secondary structures to domains. These tools enable systematic decomposition and comparison of protein architectures.
Our team also contributes to the critical assessment of AlphaFold2 predictions, including evaluation of pLDDT confidence scores as flexibility proxies (Vander Meersche et al., Structure 2025).
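One simple way to probe whether a confidence score behaves as a flexibility proxy is a rank correlation between per-residue pLDDT and a flexibility measure such as RMSF. The sketch below is an illustration under stated assumptions, not the protocol of the cited study: the helper names are hypothetical, the numbers are invented, and Spearman correlation is just one reasonable choice of statistic.

```python
import numpy as np


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1:
        raise ValueError("x and y must be 1D arrays of equal length")
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


if __name__ == "__main__":
    # Invented per-residue values, for illustration only.
    plddt = [95.0, 90.0, 70.0, 50.0, 30.0]
    rmsf_values = [0.5, 0.8, 1.5, 2.5, 4.0]
    print(spearman(plddt, rmsf_values))
```

A strongly negative value would indicate that low-confidence residues tend to be the more mobile ones in the simulation; how well that holds on real data is exactly what such an assessment has to establish.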
Biomedical applications & red blood cell biology
Our algorithms have real clinical impact. We apply computational methods to transfusion medicine, blood group antigen modeling, and the complex pathologies affecting red blood cells.
The red blood cell (RBC) is not a simple "hemoglobin bag"; it is a complex system involved in immune defense, vascular regulation, and systemic signaling.
🩸 Transfusion medicine
Modeling blood group antigens (Rh, Kell, Duffy). When patients present with rare blood phenotypes, we predict whether mutations destroy existing antigens or create new ones.
🔗 KNOTTIN Database
The KNOTTIN database catalogs cystine-knot miniproteins with therapeutic potential (Postic et al., NAR 2018).
🍬 Glycobiology
The DIONYSUS database catalogs protein-carbohydrate interfaces from the PDB (Gheeraert et al., NAR 2024).
💊 Drug design
In silico screening and design of molecules that interfere with disease mechanisms, targeting transporter proteins and adhesion proteins.
🧬 Essential Thrombocythemia
The CALR-ETdb catalogs calreticulin (CALR) variants found in essential thrombocythemia, a myeloproliferative neoplasm (El Jahrani et al., Platelets 2021).
🛡️ Antibody Engineering
ANABAG provides an annotated antibody-antigen dataset for benchmarking engineering tools (Grandguillaume et al., JCIM 2025).
🦙 VHH / Nanobodies
Our team has developed extensive expertise in camelid single-domain antibodies (VHHs/nanobodies). These miniature antibodies (~15 kDa) offer unique advantages for diagnostics and therapeutics due to their small size, high stability, and ease of production.
We pioneered the use of Protein Blocks to analyze VHH framework regions (Noël et al., Biochimie 2016), developed comparative modeling approaches (Vattekatte et al., IJMS 2021; Vishwakarma et al., IJMS 2022), and characterized VHH dynamics and humanization effects through molecular dynamics (Vattekatte et al., IJMS 2023; Martins et al., IJMS 2023, Molecules 2024).
This research connects to our work on the Duffy antigen receptor for chemokines (DARC), where we developed anti-DARC nanobodies for blood group typing (Smolarek et al., Cell Mol Life Sci 2010).
FEDER-Funded projects
- Erythropoietic protoporphyrias research
- Camelid antibodies in biotechnology
- Antibiotic resistance mechanisms
Funding & partnerships
Our research is supported by a robust funding network and strategic partnerships that span national, European, and international scales.
InIdex GR-Ex
The InIdex GR-Ex (Initiatives IdEx Globule Rouge d'Excellence) is funded by France's "Investissements d'Avenir" program. DSIMB serves as the cross-cutting bioinformatics node for this national consortium, supporting collaborative projects with leading clinicians and biologists across France.
ANR projects
Competitive grants demonstrating excellence in both basic and applied science.
ANR SugarPred
Deep Learning prediction of protein-carbohydrate interfaces.
ANR BASIN
Targeting the IL-3 pathway to inhibit basophil function in inflammatory conditions.
ANR ROPKIP
Role of Piezo1/KCNN4 interaction in red blood cell pathophysiology.