Aim
This web server focuses on the identification of small compact units in the 3D structures of proteins. These protein fragments are named "Protein Units" (PUs). They correspond to a new level of description of protein structures: they allow a better description of how protein structures are organized and are a useful tool for analyzing protein architecture.
The method works solely from the contact probability matrix, i.e., the Cα distances translated into probabilities. Its results are comparable to those of a conventional hierarchical clustering and lead to a series of nested partitions of the 3D structure. Each step aims at optimally dividing a unit into 2 or 3 sub-units according to a criterion called the "partition index", which assesses the structural independence of the newly defined sub-units. A statistical criterion (R2) assesses the dissection of the protein structure.
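As an illustration of the idea described above, the following Python sketch builds a contact probability matrix from Cα coordinates and scores a two-way split. It is a sketch under stated assumptions, not the server's implementation: the logistic transform, its parameters `d0` and `delta`, the Matthews-style form of the index, and the function names are all assumptions that are not given on this page. Only the split into 2 sub-units is covered; the split into 3 sub-units is omitted.

```python
import numpy as np


def contact_probabilities(ca_coords, d0=8.0, delta=1.5):
    """Translate pairwise C-alpha distances into contact probabilities.

    Assumption: a logistic transform. d0 (distance in angstroms at which
    the probability is 0.5) and delta (steepness) are illustrative values.
    """
    coords = np.asarray(ca_coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return 1.0 / (1.0 + np.exp((dist - d0) / delta))


def partition_index(p, cut):
    """Score the split of residues [0, cut) from residues [cut, n).

    Assumption: a Matthews-style index on block sums of the matrix.
    a and b are the contact sums inside each sub-unit, c the sum between
    them. The value approaches 1 when the sub-units barely touch.
    """
    a = p[:cut, :cut].sum()
    b = p[cut:, cut:].sum()
    c = p[:cut, cut:].sum()
    return (a * b - c * c) / ((a + c) * (b + c))


def best_two_way_split(p, min_size=1):
    """Return the cut position with the highest partition index."""
    n = p.shape[0]
    cuts = range(min_size, n - min_size + 1)
    return max(cuts, key=lambda m: partition_index(p, m))
```

Applying `best_two_way_split` recursively to each resulting sub-unit would give nested partitions of the kind described above; the stopping rule is left out because it depends on parameters such as R and the minimal PU size.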
Using the "Protein Peeling" approach on a protein structure (in PDB file format), the following analyses can be performed:
- Protein Unit identification and architecture involvement. The "Protein Peeling" approach aims at cutting the 3D protein structure into a limited set of "Protein Units" based on internal and external contacts.
- Domain identification. Protein Peeling is able to delineate protein domains using the Domain Reconstruction (DR) algorithm. Two parameters, Contact Ratio (CR) and Contact Probability Density (CPD), identify PUs or merged PUs as potential protein domains. These criteria have been optimized on a refined published benchmark of protein domains. On this benchmark, DR recovers more than 80% of the domain delineations proposed by the authors. An interesting feature of DR is its ability to propose alternative delineations.
- Recognition of unstructured N- or C-terminal segments. In a previous study, we observed that half of the protein structures of a protein databank have flexible N or C termini. This can be problematic for protein analysis or dynamics. Protein Peeling can now locate them.
- A novel scoring function. Protein Units exhibit a wide range of shapes and many differences in the types of internal contacts, which are related to energy. To characterize stability and evaluate PU contact energy, we use a statistical potential to compute a pseudo-energy for each PU.
How to Peel a Protein?
1- Upload a protein structure in PDB format through the form
Details on parameters
- Modify the R value to increase or decrease the extent of splitting. A high R value implies a high level of splitting.
- Specify the minimal size allowed for a Protein Unit produced by the algorithm.
- Below this value, continuous secondary structures are not split.
- Maximal size of a Protein Unit. By default this value is not used. If a value is chosen, the peeling process continues while the size of a Protein Unit exceeds this value.
Warnings:
- If the PDB file contains more than one chain, only the first one is considered.
- The PDB file is renumbered (the first amino acid takes number 1).
- Secondary structures are reassigned using the DSSP algorithm.
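To see what the first two warnings imply for an input file, the Python sketch below keeps only the ATOM records of the first chain and renumbers residues from 1. It is an illustrative sketch, not the server's own pre-processing: the function name is hypothetical, stopping at the first ENDMDL record and blanking insertion codes are assumptions, and the DSSP step is not reproduced.

```python
def keep_first_chain_renumbered(pdb_text):
    """Keep ATOM records of the first chain and renumber residues from 1.

    Relies on the fixed-column PDB layout: chain ID in column 22,
    residue number in columns 23-26, insertion code in column 27.
    """
    out = []
    first_chain = None
    last_key = None
    new_number = 0
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # assumption: only the first model is used
        if not line.startswith("ATOM") or len(line) < 27:
            continue  # skip HETATM, headers and truncated records
        chain = line[21]
        if first_chain is None:
            first_chain = chain
        elif chain != first_chain:
            break  # a second chain starts here: stop reading
        key = line[22:27]  # original residue number + insertion code
        if key != last_key:
            new_number += 1
            last_key = key
        # Write the new number in columns 23-26 and blank the insertion code.
        out.append(f"{line[:22]}{new_number:>4d} {line[27:]}")
    return "\n".join(out) + ("\n" if out else "")
```

Running a structure through such a step before upload makes it easier to map the residue positions reported by the server back to the original numbering.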
2- Wait for the results
3- Analyze your results
a) Summary
This section gives a short summary of the results of the peeling process:
- Levels of splitting events and Protein Units at the final level
- Domain identification with positions
- Extremity identification
You can also download all output files, or only the cleaned PDB file used for the analysis (this PDB file is needed for proper use of the PyMOL script).
b) Domain organisation
- Optimal domain delineation
In this section, the best delineation found by the Domain Reconstruction method, according to different criteria (domain contact density and contacts between domains), is presented.
- Alternative domain delineation
The other good delineations found by the algorithm are detailed.
- Jmol visualisation
Dynamic visualisation of the structure and its domains in a small Jmol applet. Pressing a specific button changes the coloring.
c) Hierarchical process of Protein Peeling
- Proposed Protein Units in the protein
The Protein Units found by the Protein Peeling algorithm are described in this section.
- Tree
The tree describes the hierarchical splitting of the protein by Protein Peeling.
- Contact Probability Map
Visualisation of the contact probability map.
- Visualisation of a specific PU level in a Jmol applet
Dynamic visualisation of the structure and its domains in a small Jmol applet. Pressing a specific button changes the coloring.
d) Detailed view of the Protein Units identified
- Level of Protein Peeling
- Opens a pop-up window with a Jmol view of the colored protein
- Protein Units
- CI parameters
- Energy (computed with a statistical potential)
- By pressing the button, you can access a dynamic representation of the structure in a Jmol applet.
e) Parameters chosen
Summary of parameters
Illustrated principle
Details on domain reconstruction algorithm
One major potential of the Protein Peeling web server was the possibility of using the PP method to identify protein domains. We therefore developed a new algorithm called domain reconstruction (DR). We use two criteria to estimate the quality of domains:
- Contact Ratio (CR)
where cp(i,j) is the contact probability between PUi and PUj, and S(i) and S(j) are the lengths of PUi and PUj. Finally, c(ij) and S(ij) are respectively the contact probability and the size of the whole domain formed by merging PUi and PUj. The α value is 0.43, as in the PDP method. High CR values indicate a high number of contacts between PUi and PUj; consequently, these PUs are good candidates for merging into one domain.
- Contact Probability Density (CPD)
Formula of contact probability density (CPD) for a domain:
where cp(x,y) is the contact probability between residue x and residue y, and n is the number of residues in the domain considered.