Welcome to the PoincaréMSA projection tool

PoincaréMSA is a computational tool for visualizing large protein families, developed by Susmelj et al. It builds an interactive projection of a protein multiple sequence alignment (MSA), either provided by the user or built on the fly from a target sequence. The underlying algorithm is based on Poincaré maps, introduced by Klimovskaia et al. The projection reproduces both the local proximities between protein sequences and the global hierarchical structure of the data.
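
To give a feel for why a hyperbolic projection suits hierarchical data, here is a minimal sketch of the distance function of the standard Poincaré disk model. It assumes that projected points are 2D coordinates strictly inside the unit disk; the function name poincare_distance is illustrative and is not taken from the PoincaréMSA code base.

```python
import math

def poincare_distance(u, v):
    """Hyperbolic distance between two points of the open unit disk.

    Illustrative helper (assumption: points are coordinate tuples with
    Euclidean norm < 1); not part of the PoincaréMSA source code.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 |u - v|^2 / ((1 - |u|^2) (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))

# Equal Euclidean steps cost more and more hyperbolic distance near the border.
origin = (0.0, 0.0)
for radius in (0.5, 0.9, 0.99):
    print(radius, poincare_distance(origin, (radius, 0.0)))
```

Because distances grow without bound towards the border of the disk, there is room for many well-separated points near the edge while points near the center stay close to everything else, which is the property that makes the disk convenient for tree-like data.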

You can find a step-by-step explanation of how a PoincaréMSA projection is built on the "Tutorial" page, and several examples of PoincaréMSA projections for protein families on the "Examples" page. The source code is available at https://github.com/DSIMB/PoincareMSA.

Please use the following reference when citing PoincaréMSA:

Susmelj A.K., Ren Y., Vander Meersche Y., Gelly J.C., & Galochkina T. (2023). Poincaré maps for visualization of large protein families. Briefings in Bioinformatics, -. https://doi.org/10.1093/bib/bbad103

Method availability
PoincaréMSA is available as interactive Google Colab notebooks:
  • PoincareMSA_colab.ipynb takes as input an MSA in .mfasta format provided by the user. The user can also provide an annotation file in .csv format, which is used for coloring, as well as a list of UniProt IDs, which is used to fetch taxonomy information automatically.
  • PoincareMSA_colab_MMseqs2.ipynb searches for sequences homologous to a target sequence, filters the resulting alignment, and projects it with PoincaréMSA.
  • PoincareMSA_colab_examples.ipynb builds PoincaréMSA projections from the example alignments available in the examples directory.
PoincaréMSA is also available as open-source Python code on GitHub (https://github.com/DSIMB/PoincareMSA).
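
Before uploading your own alignment, it can help to check that the file really is an alignment, i.e. that all sequences have the same number of columns. The sketch below assumes that .mfasta is ordinary multi-FASTA text (a ">" header line followed by one or more sequence lines); the helper names read_mfasta and alignment_length and the toy sequences are hypothetical and do not come from the PoincaréMSA notebooks.

```python
def read_mfasta(text):
    """Parse multi-FASTA text into a dict {header: sequence}.

    Hypothetical helper for illustration; assumes plain FASTA syntax.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate sequence name: {header}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[header].append(line)  # sequences may span several lines
    return {name: "".join(parts) for name, parts in records.items()}

def alignment_length(records):
    """Return the common number of columns, or raise if sequences differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"not a valid alignment, sequence lengths found: {sorted(lengths)}")
    return lengths.pop()

# Toy alignment with made-up names and residues; '-' marks a gap.
toy_alignment = """>seq1
MK-LVA
GT
>seq2
MKTLV-
GT
"""

records = read_mfasta(toy_alignment)
print(len(records), "sequences,", alignment_length(records), "columns")
```

To check a real file, read its contents with open(path).read() and pass the resulting string to read_mfasta.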