Template Format and Creating New Templates
PROSPECT users can choose either the SCOP Domain Library or the FSSP Chain Library as the template database for threading. FSSP is the default template library because it is updated frequently, following PDB releases. The FSSP library used in PROSPECT covers PDB structures released before May 2002. The SCOP domain library was constructed from release 1.59 (15 May 2002).
Each template is contained in a single XML file (this is one of the changes made between Prospect 1 and Prospect 2). We have tried to include every template that would be necessary for threading, but 'power users' may wish to create their own templates and thread against them.
New templates can be generated from PDB files with the Prospect suite's 'make_template'. This process requires Psi-Blast, the NR database, Makemat, and the DSSP program.
make_template -pdbfile <file> [-c 'Chain Letter']/[-r <start residue> <end residue>] [-n <template name>]
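When templates are generated for many PDB files, the usage line above can be assembled programmatically. The following is a minimal Python sketch, not part of the Prospect suite. The helper name `build_make_template_cmd` and the file name `1abc.pdb` are hypothetical, and the sketch assumes that `-c` and `-r` are alternatives (as the `/` in the usage line suggests) and that `make_template` is on the PATH. Only the flags shown above are used.

```python
from typing import List, Optional, Tuple


def build_make_template_cmd(
    pdbfile: str,
    chain: Optional[str] = None,
    residue_range: Optional[Tuple[int, int]] = None,
    name: Optional[str] = None,
) -> List[str]:
    """Build an argument list for make_template (hypothetical helper)."""
    # Assumption: -c and -r are mutually exclusive, per the "/" in the usage line.
    if chain is not None and residue_range is not None:
        raise ValueError("give either a chain or a residue range, not both")
    cmd = ["make_template", "-pdbfile", pdbfile]
    if chain is not None:
        cmd += ["-c", chain]
    if residue_range is not None:
        start, end = residue_range
        cmd += ["-r", str(start), str(end)]
    if name is not None:
        cmd += ["-n", name]
    return cmd


# Example: chain A of a local PDB file, with an explicit template name.
cmd = build_make_template_cmd("1abc.pdb", chain="A", name="1abcA")
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the required programs and environment variables are in place.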
The following environment variables will need to be set:
It is also a good idea to have a properly formatted HEADER entry in the PDB file, so that make_template can take the template ID from the file (alternatively, the ID can be defined with the '-n' flag).
Using Custom Templates:
There are two ways to use custom templates. First, a custom template can be passed directly to threading with the -tempfile flag, e.g.
threading.LINUX -phdfile myseq.ss -tempfile custom_template.xml
This method is not recommended, however, because later tools, such as convertProspect, will not be able to find the template. Instead, put the template in one of the directories that have been defined in $PROSPECT_PATH/data/parameters/template_paths
A good practice is to keep personal custom templates in $HOME/prospect_templates and to put templates that you want to share with other users on your system in $PROSPECT_PATH/data/templates_local
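Copying a new template into one of these two locations can be scripted. The sketch below is an illustration under stated assumptions, not a Prospect tool: the function name `install_template` is hypothetical, and it assumes that HOME (and PROSPECT_PATH, for the shared case) are set in the environment. It does not check that the destination directory is actually listed in template_paths.

```python
import os
import shutil
from pathlib import Path


def install_template(template_xml: str, shared: bool = False) -> Path:
    """Copy a template XML file into the personal or shared template directory."""
    if shared:
        # Shared location suggested above; requires PROSPECT_PATH to be set.
        dest_dir = Path(os.environ["PROSPECT_PATH"]) / "data" / "templates_local"
    else:
        # Personal location suggested above; requires HOME to be set.
        dest_dir = Path(os.environ["HOME"]) / "prospect_templates"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(template_xml).name
    shutil.copy2(template_xml, dest)
    return dest
```

For example, `install_template("custom_template.xml")` would place the file under $HOME/prospect_templates, and `install_template("custom_template.xml", shared=True)` under $PROSPECT_PATH/data/templates_local.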
The fssp portion of the XML file:
Header: a general description of the template
REM: entry label (REM for remarks and RES for protein residues)
F: flag (1 for sequence only, 2 for an alpha-helix or beta-sheet residue, 3 for an alpha-helix or beta-sheet residue without C-beta coordinates, and 4 for a core residue)
RS: one-letter code of amino acid type (X for residues other than the standard 20 amino acid types).
NUM: residue number in PDB (including possible extra character associated with it).
SS: secondary structure type, using the same convention as in DSSP (e.g., H for alpha-helix and E for beta-sheet).
ACC: Solvent accessible surface area calculated by DSSP.
x-Cb, y-Cb, z-Cb: C-beta coordinates.
x-Ca, y-Ca, z-Ca: C-alpha coordinates.
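The fields listed above can be modelled in code as a small record type. The following Python sketch is only an illustration: the class and function names are hypothetical, the exact on-disk layout of these fields inside the XML file is not assumed, and the example values are made up rather than taken from a real template. It also assumes that the "extra character" attached to NUM is a single trailing letter, as in PDB insertion codes.

```python
import re
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple

Coord = Tuple[float, float, float]


class ResidueFlag(IntEnum):
    """Values of the F field, as described in the list above."""
    SEQUENCE_ONLY = 1
    SS_RESIDUE = 2            # alpha-helix or beta-sheet residue
    SS_RESIDUE_NO_CBETA = 3   # same, but without C-beta coordinates
    CORE = 4


@dataclass
class TemplateResidue:
    flag: ResidueFlag         # F
    aa: str                   # RS: one-letter code, X for non-standard residues
    num: str                  # NUM: kept as text, may carry an extra character
    ss: str                   # SS: DSSP code, e.g. H or E
    acc: float                # ACC: solvent accessible surface area from DSSP
    cb: Optional[Coord]       # C-beta coordinates (assumed optional here)
    ca: Optional[Coord]       # C-alpha coordinates (assumed optional here)


def split_pdb_num(num: str) -> Tuple[int, str]:
    """Split a NUM value such as '27A' into (27, 'A'); assumes one trailing letter at most."""
    m = re.fullmatch(r"(-?\d+)([A-Za-z]?)", num.strip())
    if m is None:
        raise ValueError(f"unrecognised residue number: {num!r}")
    return int(m.group(1)), m.group(2)


# Illustrative values only, not taken from a real template.
res = TemplateResidue(ResidueFlag(4), "A", "27A", "H", 12.0,
                      (1.0, 2.0, 3.0), (0.5, 1.5, 2.5))
print(res.flag.name, split_pdb_num(res.num))
```

Keeping NUM as text and splitting it only on demand avoids losing the extra character that PDB numbering can carry.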