PredyFlexy software 1.3.1

1- A first test

   "python pred.py -t" tests the prediction tool.

If the sentence "Congratulations FlexyPredy works correctly" is displayed, the tool is working.

2- Usage

    Run "python pred.py --help" to display the usage.

An example can easily be run with "python pred.py -f DATA/TEST/Prot_0.fasta"

[In fact, for the complete output, you must run "python pred.py -f DATA/TEST/Prot_0.fasta --confidence --flex"]

The prediction will be found in the subdirectories, at TMP[numbers]/PRED-FINAL/predictions.txt
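
The command line above can also be assembled from a script. The following is only a minimal sketch, assuming pred.py is in the current directory and "python" is on the PATH; the helper name build_command is hypothetical, and it only uses options listed in the help text of pred.py.

```python
def build_command(fasta_path, confidence=True, flex=True, cpus=None):
    """Return the argument list for one pred.py run (hypothetical helper)."""
    # Mandatory part: the interpreter, the script and the FASTA file (-f).
    cmd = ["python", "pred.py", "-f", fasta_path]
    if confidence:
        cmd.append("--confidence")  # compute the confidence index
    if flex:
        cmd.append("--flex")        # predict flexibility
    if cpus is not None:
        cmd += ["-c", str(cpus)]    # number of CPUs to use
    return cmd


if __name__ == "__main__":
    # Only prints the command; nothing is executed here.
    print(" ".join(build_command("DATA/TEST/Prot_0.fasta")))
```

The returned list can be passed to subprocess.run(cmd) to launch the prediction.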

 
	      identifiers
	      |	
>10MHA_0  <---/
  [...]
		  <--- first aa are not predicted

      ---amino acids          -- confidence index
     /                       /
  11 G  47  93   4   3  75  15   1  0.979  0.522 
  12 L   4  34  79 114  58  13   0  0.130 -0.027 
  13 R  92  96  95  57  89  15   0  0.209  0.080 
  14 F  57  96  95 106  55  13   0 -0.183 -0.367 
  15 I  79  56  58 107   8  15   0 -0.390 -0.415 
  16 D  57 107  55  43  95  13   0  0.032 -0.131 
  17 L  11  79  12  64  44  10   0 -0.004  0.025 
  18 F  42  11  12 110 117  10   0 -0.011  0.195 
   |     |   |   |   |    |      |      \     /
  /      \   \   |   /    /      /       \   /
number    \   \  |  /    /   flexibility  flexibility
           -5 predicted --    index        profiles
	    prototypes                 (RMSf and B-factors)
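
The residue lines described above can be read with a few lines of Python. This is a sketch under assumptions, not the authors' parser: it assumes the run used --confidence and --flex (11 whitespace-separated columns), and that the two last columns are in the order given in the legend (RMSf, then B-factors). The names ResiduePrediction and parse_residue_line are hypothetical.

```python
from typing import NamedTuple, Tuple


class ResiduePrediction(NamedTuple):
    number: int                  # residue number
    amino_acid: str              # one-letter amino acid code
    prototypes: Tuple[int, ...]  # the 5 predicted prototypes
    confidence_index: int
    flexibility_index: int
    rmsf: float                  # ASSUMPTION: first profile column is RMSf
    bfactor: float               # ASSUMPTION: second profile column is B-factor


def parse_residue_line(line):
    """Parse one residue line of predictions.txt (assumed 11-column layout)."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError("expected 11 columns, got %d" % len(fields))
    return ResiduePrediction(
        number=int(fields[0]),
        amino_acid=fields[1],
        prototypes=tuple(int(value) for value in fields[2:7]),
        confidence_index=int(fields[7]),
        flexibility_index=int(fields[8]),
        rmsf=float(fields[9]),
        bfactor=float(fields[10]),
    )


if __name__ == "__main__":
    print(parse_residue_line("  11 G  47  93   4   3  75  15   1  0.979  0.522 "))
```

Identifier lines (starting with ">") and other lines that do not have 11 columns raise a ValueError and have to be skipped by the caller.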

  
Please note that there is a large number of options (some are dedicated to blast, ...):

    Usage: pred.py -f the-fasta-file [options]
  
Options:
  -h, --help            show this help message and exit
  -f FILENAME, --file=FILENAME
                        obligatory                                [no default]
  -o OUTNAME, --output=OUTNAME
                        the name of the files                            [#NA]
  -D DIROUTPUT, --DIR=DIROUTPUT
                        the temporary directory               [default = /TMP]
  -c CPU_NUMBERS, --cpu=CPU_NUMBERS
                        number of cpus to be used              [default = all]
  --confidence          to compute the confidence index           [default=NO]
  --flex                prediction of flexibility                 [default=NO]
  -t, --test            test                                 [default=NO] [na]
  --PSIBLASTDIR=PSIBLAST_DIRECTORY
                        directory of psi-blast       [default: SOFTS/psiblast]
  --PSIBLASTdb=PSIBLAST_DB
                        databank of psi-blast (formatted)
                        [SOFTS/db]
  --PsBround=NB_ROUND   psi-blast number of round                [default = 4]
  --PsBeval=E_VALUE     psi-blast e-value                   [default = 0.0001]
  --PsBval=PSB_VALUE    psi-blast value                     [default = 0.0001]
  --Makemat=MAKEMAT_DIRECTORY
                        directory of makemat          [default: SOFTS/makemat]
  --GCM=GETMAT_DIRECTORY
                        directory of get_clean_matrix        [default: SOFTS/]
  --SVMclassify=SVMCLASSIFY_DIRECTORY
                        directory of SVM classify            [default: SOFTS/]
  --SVMscale=SVMSCALE_DIRECTORY
                        directory of SVM scale               [default: SOFTS/]
  -q, --quiet           say nothing                              [not default]
  -v, --verbose         verbose                                      [default]
  -w, --very-verbose    terribly verbose                         [not default]
  -n, --noban           No ban                                  [default: one]
  --firefox             firefox                                  [not default]
  
   
The default parameters have been optimized and correspond to the results of the papers [1, 2, 3].

3- It does not work properly

Please check the versions of the different tools:
  1. "blastpgp --help" must report blastpgp 2.2.9
    		blastpgp 2.2.9   arguments:
    			[...]
    
  2. Be sure to use the Swiss-Prot databank of 2009.
  3. Go to the SOFTS directory and run "./svm_classify"; it must be SVM-light V6.01.
    	
    		> SVM-light V6.01: Support Vector Machine, classification module     01.09.04
    
  4. Go to the SOFTS directory and run "./svm-scale"
     
    	usage: ./svm-scale [-l lower] [-u upper] [-y y_lower y_upper]
    	      [-s save_filename] [-r restore_filename] filename
    	(default: lower = -1, upper = 1, no y scaling)
    
   The first two points are crucial: tests with a newer blast give very different PSSMs, and the databank also has a strong influence.
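
The version checks above can be partly automated by looking for the expected banner in the text printed by each tool. This is a minimal sketch, not part of PredyFlexy: the names EXPECTED_BANNERS, capture_output and banner_matches are hypothetical, and the banner strings are the ones quoted above.

```python
import subprocess

# Banner fragments quoted in the checks above.
EXPECTED_BANNERS = {
    "blastpgp": "blastpgp 2.2.9",
    "svm_classify": "SVM-light V6.01",
}


def capture_output(command):
    """Run a command and return everything it prints (stdout and stderr)."""
    # No check on the exit code: usage messages often exit with an error.
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout


def banner_matches(tool, output_text):
    """Return True if the expected version banner appears in the output."""
    return EXPECTED_BANNERS[tool] in output_text
```

For example, banner_matches("blastpgp", capture_output(["blastpgp", "--help"])) should be True with the right blast version. The databank version cannot be checked this way.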

4- I do not have the right version of the tool(s).

   Please search again on the web. Otherwise, copies can be found in the subdirectory download/external_src/ of the place where you downloaded the predyflexy1.1.tar.gz archive.

5- get_clean_matrix can be compiled again if needed

		cd src/
		cc get_clean_matrix.c -o ../SOFTS/get_clean_matrix 

6- I have problems with CPUs.

   As the prediction is quite long due to the use of 120 SVMs for the prediction of LSPs [1], the number of CPUs used is given as 4.
Please change it to your number of available CPUs with the option "-c CPU_NUMBERS, --cpu=CPU_NUMBERS".
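
If the number of available CPUs is not known, it can be queried from Python before building the "-c" option. This is a sketch only; the helper name cpu_option is hypothetical, and limiting the request to the number of CPUs reported by the system is a design choice of this example, not a requirement of pred.py.

```python
import os


def cpu_option(requested=None):
    """Return the "-c" option for pred.py, limited to the CPUs available."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    if requested is None:
        count = available
    else:
        # Keep the value between 1 and the number of available CPUs.
        count = max(1, min(requested, available))
    return ["-c", str(count)]


if __name__ == "__main__":
    print(" ".join(cpu_option()))
```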

7- Why are my nice results in a strange directory named TMP18196054366918182720 (or a similar name)?

   By default, a random number is added to the name of the TMP... results subdirectory. You can change it by using the option "-D DIROUTPUT, --DIR=DIROUTPUT".
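
When the default random directory name is kept, the result files can be located by searching the TMP* directories. This is a minimal sketch, assuming the runs were started from the directory given as base_dir; the helper name find_predictions is hypothetical.

```python
import glob
import os


def find_predictions(base_dir="."):
    """Return the predictions.txt files found under base_dir/TMP*, oldest first."""
    # "**" with recursive=True matches any depth of subdirectories,
    # so TMP[numbers]/PRED-FINAL/predictions.txt is covered.
    pattern = os.path.join(base_dir, "TMP*", "**", "predictions.txt")
    hits = glob.glob(pattern, recursive=True)
    return sorted(hits, key=os.path.getmtime)


if __name__ == "__main__":
    for path in find_predictions():
        print(path)
```

The last element of the returned list is the most recently modified prediction file.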

8- How can I be sure everything is done properly?

A large number of checks are performed; you must see (verbose mode ON):
	Fasta sequence done
	CPU(s) used:  4
	test of softwares done
	PSI-BLAST done
	scaling done
	prediction done
	confidence index computed
	flexibility prediction done
   and everything must conclude with:
	End of loops !!!
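
If the verbose output is saved to a file, the presence and order of these messages can be checked automatically. This is a sketch under assumptions: each message is assumed to start its own line exactly as listed above, the run is assumed to use --confidence and --flex, and the names EXPECTED_MESSAGES and missing_messages are hypothetical.

```python
# Messages listed above, in the order they are expected to appear.
EXPECTED_MESSAGES = [
    "Fasta sequence done",
    "CPU(s) used:",
    "test of softwares done",
    "PSI-BLAST done",
    "scaling done",
    "prediction done",
    "confidence index computed",
    "flexibility prediction done",
    "End of loops !!!",
]


def missing_messages(log_text):
    """Return the expected messages that were not found, in order."""
    lines = [line.strip() for line in log_text.splitlines()]
    missing = []
    position = 0  # index of the first line not yet matched
    for message in EXPECTED_MESSAGES:
        for index in range(position, len(lines)):
            # startswith() keeps "prediction done" from matching
            # "flexibility prediction done".
            if lines[index].startswith(message):
                position = index + 1
                break
        else:
            missing.append(message)
    return missing
```

An empty list means that all the steps were reported.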

9- I see many options for psi-blast. Which ones can I change?

   The tools are provided as is. You can do whatever you want. But as the great contemporary philosopher MC Hammer said, U Can't Touch This. The SVM parameters have been optimized using specific parameters for BLAST; if you change them, you change the way the prediction is done. Do not complain to us afterwards.

10- I still have questions

Please contact us (see the contact page).

Some references

  1. Bornot A., Etchebest C., de Brevern A.G.
    A new prediction strategy for long local protein structures using an original description.
    Proteins (2009) 76(3):570-87.

  2. Bornot A., Etchebest C., de Brevern A.G.
    Predicting Protein Flexibility through the Prediction of Local Structures.
    Proteins (2011) 79(3):839-52.

  3. de Brevern A.G.*, Bornot A.*, Craveur P., Etchebest C., Gelly J.-C.
    PredyFlexy: Flexibility and Local Structure prediction from sequence.
    Nucleic Acids Res (2012) 40:W317-22.
    *: authors contributed equally.

  4. Narwani T.J.*, Etchebest C.*, Craveur P., Leonard S., Rebehmed J., Bornot A., Gelly J.-C., de Brevern A.G.
    In silico prediction of protein flexibility with local structure approach.
    Submitted (2016).
    *: authors contributed equally.




      Alexandre G. de Brevern
      Last Modification : July 2016
      Paris7 Inserm INTS