Protein Informatics Group
  

PROSPECT Version 2.0:


Target Sequence File

The target sequence file contains the one-letter codes of the residues in the target protein. Spaces, line breaks, and tabs in the file are ignored. Any character that is not one of the standard codes for the 20 amino acids, e.g., X, Z, or 3, is treated as an unknown residue. The file can be either in standard FASTA format (see example 1) or in a flexible format (see example 2).
These files are passed to the program with the flag '-seqfile'.
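
The reading rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not PROSPECT's own reader: the function name, the use of "X" as the placeholder for unknown residues, the folding of lowercase letters to uppercase, and the skipping of FASTA description lines (those starting with ">") are all assumptions made for the example.

```python
# Hypothetical helper illustrating the rules described above.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
UNKNOWN = "X"  # assumption: placeholder used here for unknown residues


def read_target_sequence(text: str) -> str:
    """Return the residue string contained in a target sequence file."""
    residues = []
    for line in text.splitlines():
        if line.startswith(">"):
            continue  # assumption: a FASTA description line carries no residues
        for ch in line:
            if ch.isspace():
                continue  # spaces and tabs are ignored
            ch = ch.upper()  # assumption: lowercase codes are accepted
            residues.append(ch if ch in STANDARD_RESIDUES else UNKNOWN)
    return "".join(residues)


if __name__ == "__main__":
    print(read_target_sequence(">target\nAS FD\tVW\nXZ3"))
```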

PHD Secondary Structure Prediction

PROSPECT allows users to include a secondary structure prediction, either to identify the most probable loop positions on the target sequence or to find a fold whose secondary structure is most compatible with the prediction, with or without other energy terms. The prediction can be obtained from the online PHD server developed by Burkhard Rost, or from the tool prospect_ssp. PROSPECT reads both the old PHD output format (see example 1) and the new PHD output format (see example 2). The lines that PROSPECT uses are marked with "<--" in the excerpt below, where "Rel" gives the reliability index (0-9) of the prediction, and "prH", "prE", and "prL" give the "probability" (0-9) of assigning helix, strand, and loop, respectively:

              2...,....43...,....44...,....45...,....46...,....47...,....4        
AA |ASFDVWDLLPFTRGYVHILDKDPYLHHFAYDPQYFLNELDLLGQAAATQLARNISNSGAM|
PHD | EEE EEEEEE HHHHHHHHHHHHHHHHHHH H|
Rel |422113213179846999847925566567433112313367899999999999733411| <--
detail:
prH-|111011100000000000000042222111233343345578899999999998763245 <--
prE-|233445543410037999861000000111110111100000000000000000000000 <--
prL-|655433345479862000128956777677655435543321100000000000135654 <--
subset: SUB |..........LLL.EEEEE.LL.LLLLLLL..........HHHHHHHHHHHHHHH.....|

These files are passed to the program with the flag '-phdfile'.
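
As a rough illustration of the marked rows, the following Python sketch pulls the "Rel", "prH", "prE", and "prL" digits out of the excerpt above. It is not PROSPECT's parser: the function names and the regular expression are assumptions, successive blocks are assumed to concatenate in order, and taking the highest-scoring state per residue is done here only to show how the three rows relate to each other.

```python
import re

# Matches the four marked rows, e.g. "Rel |4221...|" or "prH-|1110...".
ROW = re.compile(r"^\s*(Rel|prH-|prE-|prL-)\s*\|(\d+)")


def parse_phd(text):
    """Collect the per-residue digits of the Rel, prH, prE and prL rows."""
    rows = {"Rel": [], "prH": [], "prE": [], "prL": []}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            key = match.group(1).rstrip("-")
            rows[key].extend(int(digit) for digit in match.group(2))
    return rows


def most_probable_states(rows):
    """Pick H, E or L per residue; ties resolve in H, E, L order."""
    states = []
    for h, e, l in zip(rows["prH"], rows["prE"], rows["prL"]):
        best = max((("H", h), ("E", e), ("L", l)), key=lambda pair: pair[1])
        states.append(best[0])
    return "".join(states)


EXCERPT = """\
Rel |422113213179846999847925566567433112313367899999999999733411| <--
detail:
prH-|111011100000000000000042222111233343345578899999999998763245 <--
prE-|233445543410037999861000000111110111100000000000000000000000 <--
prL-|655433345479862000128956777677655435543321100000000000135654 <--
"""

rows = parse_phd(EXCERPT)
states = most_probable_states(rows)

if __name__ == "__main__":
    print(len(rows["Rel"]), states)
```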

Profile

To enhance threading performance, one may employ a profile (frequency matrix) derived from a multiple sequence alignment of the target's protein family.
These profiles are derived from the 'checkpoint files' of PSI-BLAST searches.

Prospect understands two formats, the '.chk' file and the '.freq' file, which hold the same data in different forms. PSI-BLAST produces a '.chk' file, a binary file consisting of the sequence description followed by the profile. We also include a 'read_chk' program that takes this binary file and outputs the same data in the same layout, but in ASCII. The advantage is that the result is human readable and, more importantly, platform independent. Binary files depend on the byte order of the machine they were created on, which can cause incompatibility between machines with different byte orders, say Mac and PC. From here on, we call the ASCII conversion of a checkpoint file a 'freq file'.

Checkpoint files can be used via the command line argument '-chkfile'.
Frequency files can be used via the command line argument '-freqfile'.
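
The byte-order problem with binary files can be seen in a small Python sketch. It says nothing about the actual layout of a '.chk' file; it only shows, using the standard struct module, that the same four bytes decode to different integers depending on the byte order the reader assumes, whereas an ASCII rendering of the number has no such ambiguity.

```python
import struct
import sys

value = 1

# The same integer written by a little-endian and a big-endian machine.
little = struct.pack("<i", value)
big = struct.pack(">i", value)

# Reading little-endian bytes while assuming big-endian gives a wrong number.
misread = struct.unpack(">i", little)[0]

# The ASCII form is identical regardless of the machine that wrote it.
as_text = str(value)

if __name__ == "__main__":
    print("this machine:", sys.byteorder)
    print("little-endian bytes:", little.hex())
    print("big-endian bytes:   ", big.hex())
    print("misread value:", misread)
    print("ASCII form:", as_text)
```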


Template Lists

Sometimes you do not want to search against the whole database, but only a subset. Prospect takes a list of templates to run via the -tfile <file> argument. By default, the template list is $PROSPECT_PATH/data/parameters/fssp.list.
The format of the template list is simply one template name per line. Given a template name, for example 1aac, Prospect will search for the file 1aac.xml in one of the directories defined in $PROSPECT_PATH/data/parameters/template_paths.
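
The lookup described above can be sketched in Python as follows. The function names are hypothetical, the search directories are passed in directly because the format of the template_paths file is not described here, and returning the first match in list order is an assumption rather than documented Prospect behavior.

```python
from pathlib import Path


def read_template_list(path):
    """Read a template list: one template name per line, blank lines skipped."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_template(name, directories):
    """Return the first <name>.xml found in the given directories, else None."""
    for directory in directories:
        candidate = Path(directory) / f"{name}.xml"
        if candidate.is_file():
            return candidate
    return None  # assumption: a missing template is reported as None


if __name__ == "__main__":
    import sys

    # Usage: python templates.py <template list> <dir> [<dir> ...]
    list_file, search_dirs = sys.argv[1], sys.argv[2:]
    for template in read_template_list(list_file):
        print(template, find_template(template, search_dirs))
```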

You can also use the SCOP list by adding the -scop flag, or run all templates in the working template directories by adding the -all flag. To run only the templates in the working directories that are not part of the FSSP or SCOP databases (i.e., the templates that you created with make_template), add the -custom flag.
