Lisa Balbes' Guide to Rational (Computer-Aided) Drug Design
February 1992

Conventions:
   Definitions are indicated by an = .
   (see also J. Med. Chem. (1985) _28(9)_, 1133-1139)
   Program names are in all capital letters.

Recipe:
1. Determine ligand(s) (=small molecule with known biological effect)
   or receptor site (=place where the ligand binds and acts) of interest.
2. Build model of active site of the receptor.
   a. Model protein then focus on active site or
   b. build pharmacophore (=geometric description of the minimal
      requirements of the active site, including both shape and charge
      complementarity to the ligands.)
3. Develop inhibitor structures that fit active site model and dock
   into the active site model.
4. Quantitate interaction, to predict activity of new compounds.
5. Synthesize compounds; evaluate their activity in the system of interest.
6. Refine (2), repeat (3) - (5).
7. Find effective, safe, synthesizable compound.  Patent.  Sell.  $$$$$$$.
   Either a) start again at (1) or b) retire to Hawaii.

Implicit Assumptions:
1. Receptor shape does not change upon binding or for different ligands.
2. All active analogs are interacting at the same site.

********** step 2a ******************************
Homology Modeling = Knowledge Based Modeling = use homologous proteins
   (where 3D structure of one is known) to predict structure of protein
   of interest (whose 3D structure is unknown, but sequence is known)

   Steps - find homologous sequences (using sequence alignment methods),
      use residue substitution followed by refinement (minimization or
      molecular dynamics) to predict structure of unknown protein

   COMPOSER by Blundell and co-workers
      Eur. J. Biochem. (1988) _172_, 513-520.  Blundell et al.
      Nature (1987) _326_ 347-352.  Review Article.
      Protein Engineering (1987) _1(5)_ 377-384.  Sutcliffe,...Blundell
      Protein Engineering (1987) _1(5)_ 385-392.  Sutcliffe,Hayes,Blundell

Not to be confused with:

Protein Folding = predict 3D structure from amino acid sequence.
   (predict both backbone conformation & side chain packing)
      Current Opinion in Structural Biology (1991) _1_, 224-229.  Review.
      Biochemistry (1988) _27(1)_ 7167-7174.  Review, uses of NMR
      J. Mol. Biol. (1991) _217_, 373-388.  C. Lee & S. Subbiah
      Scientific American (January 1991) 54-63.

Protein Engineering = modifying residues of protein to change its
   specificity _or_ trying to find sequence that will fold into a
   specific 3D shape.
      Current Opinion in Structural Biology (1991) _1_, 617-623.  Review.
      J. Mol. Biol. (1991) _220_, 495 - 506.  Wilson, Mace and Agard.
      Nature (1991) _352_, 448-451.  Lee and Levitt.  Accurate prediction
         of stability/activity of protein mutants.

********** step 2 *******************************
GRID method of Peter J. Goodford
      J. Med. Chem (1985) 28,849-857.  Original paper, method described.
      J. Mol. Graphics (1989) 7,103-108.
      J. Med. Chem. (1989) 32, 1083-1094.

   Assuming you have the structure of the receptor of interest, this
   will define the regions favorable for ligand binding.

   Interaction of small probe with protein of known structure is computed
   at positions throughout and around macromolecule, resulting in grid
   array of energy values.  Contour surfaces at appropriate energy values,
   when displayed over protein structure, can locate potential binding
   pockets.  (contour at negative energy = region of attraction between
   probe and macromolecule).

   Potential = Lennard-Jones, electrostatic and H-bonding terms
   Probe = water, methyl, amine N, carbonyl O, hydroxyl
   Input is a pdb file
   Output is a list of #s, display part must be user-written.

   $12,500 US commercial, free to academic (as of 1989).
   Molecular Discovery Limited, West Way House, Elms Parade,
   Oxford OX2 9LL, England    Telephone +44-993-830385

********** step 3 *******************************
Notes:
   Only way to know active conformation for sure is to:
      1. See it in an xtal structure bound to the receptor
         (but is the crystal conf the same as the solution conf?)
      or
      2. Find a rigid analogue with no conf.
         flexibility and high activity
         (but are you sure it's binding where you think it is?)

********** step 3 *******************************
Distance Geometry
   Developed by Crippen and co-workers.
   Good summary in J. Med. Chem (1986) _29(6)_, 899-906.
   Original work in refs 14,15,16 of that paper.

   For a given set of N points, and distance constraints (min and max
   distance between some Ni and Nj's), generate 3D coordinates for all
   N's such that all constraints are met.  Structures are generated from
   random initial point, so end up with "monte carlo sampling of
   conformational space with distance constraints".

   QCPE program # 590, DGEOM by Jeff Blaney

   Extension: include all atoms of all molecules in one large distance
   bounds matrix.
      Acc. Chem. Research (1987) _20_, 322-329.

********** step 3 *******************************
3D Database Searching
   search database of 3D molecule structures for any that fit active
   site model
      Reviews in Comp. Chem, vol. 1, 213-263.  Martin, Bures, Willett.
      Emerging Technologies and New Direction in Drug Abuse Research,
         Ed. R. Rapaka, NIDA Monograph, Row Scientific, Rockville
         Maryland, 1991.  p 62 - 77.  R. S. Pearlman

   CONCORD - rule based program, converts 2D structure to low energy
      3D structure
      Pearlman, Chemical Design Automation News (1987) _2(1)_, 1,5-7.
      Available from Tripos Assoc. 1-800-323-2960

   COBRA - converts 2D structure to "all" 3D low energy confs
      J. Chem. Inf. Comput. Sci.(1990) _30_, 316.  Leach, Dolata, Prout.

   Software to Search 3D Databases:
      ALADDIN - Daylight Chemical Information Systems (714-476-0451).
      MACCS-3D - Molecular Design Ltd. (201-540-9090 Darlene Ortiz)
      SYBYL/3DB - Tripos (1-800-323-2960)
      ChemDBS-3D - Chemical Design Ltd.
      3DSEARCH program - Sheridan (@Lederle, not being distributed)

********** step 3 (docking) **********************
DOCK     Irwin D. Kuntz, Jr.     J. Mol. Biol. (1982) 161, 269-288.
   Finds potential docking sites on proteins of known structure by
   starting with solvent accessible surface, and filling cavities with
   overlapping spheres to make binding pockets.  Ligands of known
   structure (found by searching database*) are then automatically
   docked into this "site".
      * must also have Cambridge database or same format database

   Potential is 2 terms - hard sphere repulsions and hydrogen bonding only.
   Both molecules assumed to be rigid.

   Extensions -
      Divide ligand into small pieces, dock separately, then rejoin.
         J. Med. Chem (1986) 29, 2149-2153.
      Evaluate for goodness of fit, keep best to examine further.
         J. Med. Chem. (1988) 31, 722-729.
      Add second step that examines electrostatic and hydrogen bonding
      properties of receptor site, displays them and suggests possible
      structural modifications to the ligand.
         Probing Bioactive Mechanisms, ACS Symposium Series #413,
         Chapter 4, DesJarlais, Seibel and Kuntz, ACS, 1989.

********** steps 3, 4 ******************************
GROW method of Moon and Howe (proprietary program)
      Proteins: Structure, Function and Genetics (1991) _11_ 314-328.

   Intrinsic activity = resulting from steric/electronic factors,
      ignoring distribution, metabolism and delivery to active site

   GROW - Program to generate peptides of a specific length to fit a
      pre-defined cavity (active site)

   User defines active _site_ of interest, and places acetyl group as
   _seed_ for peptide inhibitor growth.  Templates of amino acids from
   library are attached, goodness of fit evaluated, and top scoring
   structures are retained as templates for the next round of
   attachments.  Growth is N to C, C to N, or alternating (user defined),
   so each level is one residue longer.

   Score = - [ E(vdw) + E(es) + E(conf) + E(solv, template) + E(solv, recep) ]

   Library of amino acid fragments generated by conformational search
   and _partial_ optimization.  Low energy, non-identical structures
   retained (100 - 5000 per residue).
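
The grow, score and retain loop described above can be sketched in a few
lines of Python. This is only an illustration of the search strategy, not
the GROW program itself (which is proprietary): the residue classes, the
pocket subsites and toy_score are all hypothetical stand-ins for the real
fragment library and energy-based score, and growth here is N to C only.

```python
from typing import Callable, List, Sequence, Tuple

Peptide = Tuple[str, ...]

# Hypothetical residue classes and pocket subsites, for illustration only.
RESIDUE_CLASS = {"ALA": "hydrophobic", "LEU": "hydrophobic",
                 "SER": "polar", "ASP": "charged"}
POCKET = ["hydrophobic", "polar", "charged"]


def toy_score(peptide: Peptide) -> float:
    """Toy stand-in for the GROW score (higher is better): +1.0 for each
    residue whose class matches its subsite, +0.5 more if it is LEU."""
    score = 0.0
    for residue, subsite in zip(peptide[1:], POCKET):  # skip the seed
        if RESIDUE_CLASS[residue] == subsite:
            score += 1.0
            if residue == "LEU":
                score += 0.5
    return score


def grow(seed: Peptide, library: Sequence[str],
         score: Callable[[Peptide], float],
         n_levels: int, keep: int) -> List[Peptide]:
    """Attach every library fragment to every retained structure, score
    the results, and retain the `keep` best as templates for the next
    level. Each level is one residue longer than the last."""
    retained = [seed]
    for _ in range(n_levels):
        candidates = [parent + (fragment,)
                      for parent in retained
                      for fragment in library]
        candidates.sort(key=score, reverse=True)
        retained = candidates[:keep]
    return retained


best = grow(("ACE",), list(RESIDUE_CLASS), toy_score, n_levels=3, keep=2)
print(best[0], toy_score(best[0]))
```

Retaining only the top scorers at each level keeps the number of candidates
from growing exponentially with peptide length, at the price of possibly
discarding a partial structure that would have scored well once extended.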
   Final structures evaluated visually and by energy criteria -
      E(binding) = E(complex) - E(unbound receptor) - E(unbound peptide)
   making sure to solvate unbound receptor and unbound peptide.

************* step 4 *************************
Free Energy Perturbation
      Ann. Rev. Biophys. Biophys. Chem. (1989) _18_, 431-492.  Review
      J. Am. Chem. Soc. (1989) _111_, 8050-8508.  Thr => Val, calc & exptl
      J. Med. Chem. (1989) _32_, 2542-2547.  Selective Elimin of Interactions
      J. Med. Chem. (1991) _34_, 2654-2659.  Application to HIV inhibitors
      J. Comp. Chem. (1991) 12(2), 271-175.  Required length of simulation.
      J. Chem. Phys. (1991) 94(6) 4532-4545.  Problems, assumptions.

   Allows calculation of difference in delta(G) of 2 similar structures,
   by slowly changing one molecule into another.  Growth must be slow
   enough that system is "always" at equilibrium.

                               a
      Molecule A, condition 1 <=====> Molecule B, condition 1
                /\                              /\
                ||                              ||
               c||                             d||
                ||                              ||
                \/             b                \/
      Molecule A, condition 2 <=====> Molecule B, condition 2

   delta(G) for a and b can be calculated, c and d can usually be
   measured experimentally.
   b - a = d - c = delta(delta(G)) can be used to check accuracy of
   calculations.

   Results are more accurate when changes are electrostatic, not steric.
   Can be difficult to determine whether errors are from fault in
   methodology, or from badly parameterized force field.
   Simulations of > 100 ps (or even 200 ps) are needed for precise free
   energy values.
   Averaging results from shorter runs is not accurate.

************* step 4 *************************
Min/MD cycling
      Friedrich Rippmann and N. Michael Green, personal communication

   Alternate MD (400 K, 0.2 ps) with MIN (200 steps), ~20 cycles
   Calculate interaction energy (I) at end of each cycle
   Average I's
   Linear correlation with delta(G)

GENERAL APPROACHES
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**********************
Quantitative Structure Activity Relationship (QSAR)
      Acc. Chem. Res. (1986) _19_, 392-400.
      Acc. Chem. Res. (1969) _2_, 232.
      Quant. Struct.
         Activity Relat. (1988) _7_ 18-25.
      J. Med. Chem. (1991) _34_, 2824-2836.  Neural Nets in QSAR

   Measure observables for series of compounds, then use various
   mathematical techniques to derive equation where activity is a
   function of all (or some) of these observables.  This equation is
   then used to predict the activity of novel compounds.
   No 3D information is involved - just exptl parameters

   Biological Property (activity) = f(observable, measurable properties)
      where the function is almost always a sum of terms of the form
      coefficient*property

*** QSAR Terms

Comparative Molecular Field Analysis (CoMFA) - extension to QSAR to
   include 3D information
      Biomedical Technology (Jan or Feb 1992) 80-84, Meyer and McMillan.
      J. Med. Chem. (1991) _34_, 2338-2343.
      J. Am. Chem. Soc. (1988) _110_, 5959-5967.

   Align series of compounds, then calculate steric and electrostatic
   field for each compound at each point on a grid surrounding the
   molecule.  Use these field points to arrive at an equation as
   described above - most will disappear, but you will be left with a
   small set of important regions.  These will tell you exactly where
   to add/remove substituents/charges to increase activity.

Partial Least Squares (PLS) - a mathematical method for solving for one
   equation from a multitude of unknowns.
   Less likely than conventional regression to produce chance correlation.
   If all "signal" is concentrated in a few columns, PLS may overlook it.

Crossvalidation - a mathematical method used to determine whether the
   equation is generalizable to other sets of molecules

Bootstrapping - a mathematical method used to generate confidence limits
   for each term of the equation.
   Normal QSAR assumes that the variables are drawn from normal,
   independent distributions.  Bootstrapping does not; it assumes that
   the only thing you know about the variable distribution is the values
   you actually have.
   Select random rows from table, generate best model for that data.
   Repeat several times, saving each model, then combine all of them to
   generate a final model.  (With replacement - some compounds may be
   used more than once in a single analysis)

Things that indicate closer examination of the data is needed:
   Dramatic decline from "normal" r^2 to a crossvalidated r^2
   A high ( >> 0.05) std deviation for a bootstrapped r^2

**********************
Active Analog Approach (marketing rights owned by Tripos Associates)

   Recognition at active site, not biological potency, is factor to
   consider.  Want to deduce minimal recognition requirements to
   understand how a diverse set of chemical structures can activate the
   same receptor.

   Technique used is manipulation of orientation maps.  Must define the
   essential groups for activity, and all possible conformations of each
   active compound (thus possible orientations of essential groups
   relative to each other).

   Orientation map = each point represents one possible arrangement of
      essential groups, thus one possible pharmacophore.
   Intersection of OMaps for all active compounds = only possible
      pharmacophores
   Receptor essential volume = volume not available to drugs for binding
      = Volume of inactive compounds - volume of active compounds
        (volume inactives use, that actives don't)

      G. R. Marshall, C. Dave Barry, H. E. Bosshard, R. A. Dammkoehler,
      D. A. Dunn, "Computer Assisted Drug Design", ACS, 1979, ACS
      Symposium Series #112, E. C. Olson and R. E. Christofferson, editors.

*************************************
Simulated Annealing
      General (mathematical) review in Science (1983) 220, 671-680.

   Method to find min or max of function that depends on many variables.
   Run a long dynamics simulation, while gradually lowering the
   temperature.  Energetically excited molecule should then cool into a
   favorable energy well which corresponds to a local energy minimum in
   conformational space.
   Typically do many runs, and observe where final structures are
   clustering.
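
The "many runs, then look at the clustering" idea can be sketched on a toy
problem. Note the assumptions: this sketch uses Metropolis Monte Carlo moves
(the general form discussed in the Science review) rather than a molecular
dynamics simulation, and the one-dimensional double-well energy, the cooling
schedule and all parameter values are made up for illustration.

```python
import math
import random
from collections import Counter


def energy(x: float) -> float:
    """Hypothetical 1D energy surface with two wells: a global minimum
    near x = -1.30 and a shallower local minimum near x = 1.13."""
    return x**4 - 3.0 * x**2 + x


def anneal(rng: random.Random, x0: float, t_start: float = 5.0,
           t_end: float = 0.01, steps: int = 5000, step_size: float = 0.5):
    """One annealing run: random moves accepted by the Metropolis
    criterion while the temperature is lowered geometrically."""
    x, e = x0, energy(x0)
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability, which shrinks as the system cools.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
    return x, e


def run_many(n_runs: int = 20, seed: int = 0):
    """Repeat the annealing from random starting points."""
    rng = random.Random(seed)
    return [anneal(rng, rng.uniform(-2.0, 2.0)) for _ in range(n_runs)]


finals = run_many()
# Crude "clustering": count how many runs ended in each well.
wells = Counter("x < 0 well" if x < 0 else "x > 0 well" for x, _ in finals)
print(wells)
```

Cooling more slowly (more steps, or a smaller ratio between successive
temperatures) gives each run more chances to escape the shallower well
before it freezes in.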
   Similar to FEP in that catastrophic changes in system are avoided
   (you are essentially always at equilibrium).  Therefore error is
   minimized.

*************************************
Molecular Dynamics (MD)
      Angew. Chem. Int. Ed. Engl. (1990) _29_ 992-1023.

   Solving the equations of motion for all atoms in a system as a
   function of time, thus creating a picture of the system as it evolves
   in time.
   Longer simulations (~100s of ps) and with explicit water give better
   results.

   Temperature dependence of MD Simulations,
      J. Mol. Biol. (1990) _215_, 430-455.  Loncharich & Brooks
   Effect of Solvation (water) on MD simulations.
      Chemica Scripta (1989) _29A_, 197-203, Michael Levitt.
      Chemical Physics (1991) _158_ 383-394.  Brooks, Steinbach, Loncharich

*************************************
Conformational Search
   Move specified bonds x degrees, generate all possible conformations

   Types of Conformational Search:
   Systematic: rotate each specified bond by systematically incrementing
      its torsion angle through a range (typically 0 to 359 degrees).
      The value of the increment (step size) determines the
      fineness/coarseness of the search.
   Constrained: pre-process above results, throwing away high energy,
      sterically dis-allowed, etc. conformations
   Torsion Driver: same as systematic, but each conformation is minimized.

      J. Computer-Aided Molecular Design (1989) _3_, 3-21.

*************************************
General Reviews:
   Topics in Stereochemistry (199?) _20_, 1-85  Ripka and Blaney
   Methods in Enzymology (1991) _203_, 587-613.  Martin
   Dynamics of Proteins and Nucleic acids, McCammon and Harvey, 1987,
      Cambridge University Press.
   Interaction Energies: their role in drug design, Pettitt and Karplus,
      Topics in Molecular Pharmacology, (1986) _3_ 76-113.
   J. Med. Chem. (1990) _33_, 883-894.
   Ann. Rev. Pharmacol. Toxicol. (1987) _27_, 193-213.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% standard disclaimer %%%%
Lisa M. Balbes, Ph.D.
Research Triangle Institute, PO Box 12194
Research Triangle Park, NC 27709-2194
phone: 919-541-6563
vmail: 919-541-6767, xt 6563
email: balbes@osiris.rti.org
- This came directly from a computer and should not be doubted or disbelieved.-