|
574. PFIP: Portable FORTRAN Implementation of PRISM
by Millard H. Lambert and Harold A. Scheraga,
Department of Chemistry, Cornell University, Ithaca,
New York 14853
PFIP uses the amino-acid sequence of a protein to find
the most probable chain conformations of the protein.
Residue conformations are given in terms of a four-
state model; chain conformations are represented as a
sequence of single-residue states.
PFIP uses pattern-recognition techniques to predict the
approximate conformation of a protein chain from the
amino-acid sequence. A complete description of the
theory is given in a series of three papers submitted
to the Journal of Computational Chemistry (refs. 1-3).
Single-residue conformations are represented in terms
of four conformational states: α, ε, α* and ε*.
These states are defined by regions in the φ, ψ map,
and a precise definition is given in the first paper of
the series (ref. 1). The α-state occurs in the right-handed
α-helix, the α*-state occurs in the (rare) left-handed
α-helix, and the ε-state occurs in extended chains and
in the β-sheet. The ε*-state does not occur in any
common element of secondary structure.
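As a rough illustration of the four-state representation, the sketch below stores a chain conformation as a tuple of state labels and counts how many such tuples exist for a chain of a given length. The labels are ASCII stand-ins for the four state names, and the names STATES, example_chain and count_chain_states are hypothetical; nothing here is taken from the PFIP source.

```python
# ASCII stand-ins for the four single-residue state labels (assumed spelling).
STATES = ("a", "e", "a*", "e*")

# A chain conformation is a sequence with one state per residue.
example_chain = ("a", "a", "e", "e*")


def count_chain_states(n_residues):
    """Number of distinct state sequences for a chain of n_residues."""
    return len(STATES) ** n_residues


if __name__ == "__main__":
    print(count_chain_states(len(example_chain)))  # 4 residues
    print(count_chain_states(200))                 # the largest chain PFIP accepts
```

The count grows as 4 to the power of the chain length, so enumerating every sequence is out of the question for a protein of realistic size.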
The conformation of the entire chain is represented by
a sequence of single-residue conformational states; the
distinct conformations in the representation are called
"chain-states." The prediction calculation involves
two steps. First, pattern-recognition techniques are
applied to the amino-acid sequence to compute
tripeptide conformational probabilities. Then, the
tripeptide probabilities are used to compute chain-
state probabilities, and a search procedure is
introduced to find the most probable chain-states.
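The second step can be pictured with a small sketch. It is not the search procedure of ref. 2; it assumes, purely for illustration, that a chain-state is scored by the product of the tripeptide probabilities of its overlapping three-residue windows, and it uses an ordinary best-first search to list the highest-scoring sequences. The function name top_chain_states, the layout of the tri table, and the state labels are all hypothetical.

```python
import heapq
import math

# ASCII stand-ins for the four single-residue state labels (assumed spelling).
STATES = ("a", "e", "a*", "e*")


def top_chain_states(tri, n_residues, k):
    """Return up to k (chain, score) pairs in order of decreasing score.

    tri[i] maps a triple of states for residues i, i+1, i+2 to a
    probability (hypothetical input; missing triples count as zero).
    The score is the product of the window probabilities (an assumption).
    """
    # Heap entries are (cost, partial chain); cost = -log(score so far).
    # Every probability is <= 1, so cost never decreases as a chain grows,
    # and complete chains leave the heap in order of decreasing score.
    heap = [(0.0, ())]
    results = []
    while heap and len(results) < k:
        cost, chain = heapq.heappop(heap)
        if len(chain) == n_residues:
            results.append((chain, math.exp(-cost)))
            continue
        for state in STATES:
            longer = chain + (state,)
            step = 0.0
            if len(longer) >= 3:
                p = tri[len(longer) - 3].get(longer[-3:], 0.0)
                if p <= 0.0:
                    continue  # impossible window: prune this branch
                step = -math.log(p)
            heapq.heappush(heap, (cost + step, longer))
    return results


if __name__ == "__main__":
    # One window only (three residues), with made-up probabilities.
    tri = [{("a", "a", "a"): 0.5, ("e", "e", "e"): 0.3, ("a", "e", "a"): 0.2}]
    for chain, score in top_chain_states(tri, 3, 2):
        print("".join(chain), round(score, 3))
```

Because the search returns a ranked list rather than a single answer, it mirrors the idea of examining several highly probable chain-states instead of only the first.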
The use of probabilities in the first step is crucial
to the success of the procedure. The pattern-
recognition procedure cannot make single-residue
predictions with 100% accuracy; consequently, the most
probable chain-state will almost always contain
numerous errors. However, one or more of the other
highly probable chain-states may be similar or
identical to the native conformation.
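A short calculation shows why the single most probable chain-state is unlikely to be error-free. The sketch assumes that errors at different residues are independent, which is a simplification, and the accuracy values are hypothetical rather than figures reported for PFIP.

```python
def prob_all_correct(per_residue_accuracy, n_residues):
    """Chance that every residue is predicted correctly,
    assuming independent errors at each residue (a simplification)."""
    return per_residue_accuracy ** n_residues


if __name__ == "__main__":
    # Hypothetical accuracies for a 100-residue chain.
    for accuracy in (0.70, 0.90, 0.99):
        print(accuracy, prob_all_correct(accuracy, 100))
```

Even at 99% single-residue accuracy, the chance of a fully correct 100-residue prediction under this assumption is only about 0.37, which is why the search looks beyond the top-ranked chain-state.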
PFIP requires two input files. The first (which must
be connected to FORTRAN unit 7) contains the pattern-
recognition parameters that were derived from an
analysis of protein structures in the Brookhaven x-ray
data bank. These parameters are discussed in the first
paper of the series (ref. 1). The pattern-recognition
parameter file is the second data file on the
distribution tape and should not be modified by the
user in any way. Using the CMS operating system on IBM
hardware, the parameter file may be connected to
FORTRAN unit 7 with a filedef statement.
The second input file (which must be connected to
FORTRAN unit 8) contains the amino-acid sequence of the
protein, as well as several parameters. This file must
be supplied by the user.
Restrictions: The protein may have no more than 200
residues; PFIP is limited to 500 chain conformations in
the probability-directed search calculation.
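Before preparing the unit 8 input file, a user might check a sequence against these limits. The sketch below is a hypothetical pre-check, not part of PFIP; the function name check_limits, its arguments, and the representation of the sequence as a list of residue codes are assumptions, and it does not read or write the PFIP input format.

```python
# Limits quoted in the restrictions above.
MAX_RESIDUES = 200
MAX_CHAIN_STATES = 500


def check_limits(sequence, requested_chain_states):
    """Return a list of messages describing any limit that is exceeded.

    sequence is a list of residue codes (assumed representation);
    requested_chain_states is how many chain conformations the user
    would like the search to keep.
    """
    problems = []
    if len(sequence) > MAX_RESIDUES:
        problems.append(
            "sequence has %d residues; limit is %d"
            % (len(sequence), MAX_RESIDUES)
        )
    if requested_chain_states > MAX_CHAIN_STATES:
        problems.append(
            "requested %d chain conformations; limit is %d"
            % (requested_chain_states, MAX_CHAIN_STATES)
        )
    return problems


if __name__ == "__main__":
    for message in check_limits(["ALA"] * 201, 600):
        print(message)
```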
_________
References
1. M. H. Lambert and H. A. Scheraga, "Pattern
Recognition in the Prediction of Protein Structure.
I. Calculation of Tripeptide Conformational
Probabilities from the Amino Acid Sequence," submitted
to J. Comp. Chem.
2. M. H. Lambert and H. A. Scheraga, "Pattern
Recognition in the Prediction of Protein Structure.
II. Chain Conformation from a Probability-Directed
Search Procedure," submitted to J. Comp. Chem.
3. M. H. Lambert and H. A. Scheraga, "Pattern
Recognition in the Prediction of Protein Structure.
III. An Importance-Sampling Minimization Procedure,"
submitted to J. Comp. Chem.
_________
FORTRAN (IBM VS2)
Lines of Code: 10,102
|