Molecular modeling raison d'être is to perform computer simulations, or to do chemical experiments with a computer rather than a laboratory bench. Though the results of such simulations have to be validated by experiment, it is known that in many cases we may rely on them as much as on the true experimental values. On many occasions calculations showed flaws in original experimental data and led to corrected experiments which agreed with the theory. Simulation requires a model, a mathematical description of a system to be simulated. It is important to accept that the particular model need not be ``real'', but good models usually have a sound scientific justification to yield useful results.
This section briefly outlines potential energy functions used in molecular mechanics and dynamics methods. Intelligent use of these methods requires more information than can be given here. An excellent monograph on this topic by Burkert & Allinger (1982) is available. Also the actual software instructions accompanying molecular modeling systems describe the terms in great detail. There are in general two approaches to molecular simulation: quantum methods, and empirical methods. Quantum methods are based on principles of quantum mechanics. The most rigorous implementations of quantum formalism, called ab initio methods, do not require and experimental data, beside a handful of basic physical constants like: speed of light, electron mass, elementary charge, etc. The empirical methods reviewed in this section derive their origin from classical chemical concepts and are based for the most part on experimental observations. This does not mean however that they do not use quantum results and procedures. In fact there are numerous examples of combination, quantum/empirical approaches, where some parts of a system are treated quantum mechanically and others are described by classical mechanics. The boundary between these methods becomes more and more blurred with time.
It is well known from structural chemistry and from quantum calculations that bond lengths and valence angles in typical units and groups are very similar even if they appear in different molecules. The single bond between two sp hybridized carbon atoms is around 1.5 Å long and the valence angle for an sp3 carbon is usually close to 109 . On the other hand, these values are frequently distorted in some strained ring systems (like cyclopropane) or crowded molecules (like tetramethylmethane). The concept of molecular strain can be imagined as if bonds were elastic springs whose distortion is reflected by a positive increment (i.e., thermodynamically unfavorable) to the potential energy of the molecule. These approaches were originally called empirical potential energy functions but the term molecular mechanics is now in common use. The mathematical form of this potential energy function (also called potential energy surface) is:
where represents the potential energy of the molecular system and depends upon the cartesian coordinates of all atoms denoted here as . 's are individual energy terms representing contributions from individual interactions which depend upon the cartesian coordinates of partaking atoms. Having we can calculate the potential energy of the molecular system as well as the forces acting on atoms and groups. Individual energy terms, , are functions of coordinates (usually internal coordinates) and parameters, i.e., constants, called force field parameters. The set of such parameters and the functional form of these terms is called a force field. In a good force field, parameters are well balanced and produce consistent results. On the other hand, there are many different force fields which differ substantially in the values of corresponding parameters and in the functional form of the energy terms.
Merging parameters from
different force fields is therefore discouraged.
While the internal coordinates can easily be
calculated from the cartesian coordinates of the atoms
of the actual molecule, the parameters entering into the
energy function must be known in advance for all types of atoms comprising
the molecular system. As will become clear from the following discussion, many
parameters are needed to describe even simple biological molecules.
For some more exotic atom types, such parameter are still not available.
You can always put some number
in place of the missing force field parameter, however, your results will be
as reliable as these parameter values. There is a growing tendency to
estimate needed force field parameters from ab initio quantum
calculations, by performing such calculations on small, model molecules
assuming that the results are transferable to larger molecules. This approach
has the added advantage that calculations can be performed for all, even
non-existing species, while reliable experimental data are only available
for the most popular molecules.
Individual terms can be grouped into three classes:
The bond stretching term, , for a covalent bond between atoms i and j, represents a contribution to the potential energy resulting from deformation of the optimal bond length. Most frequently a simple symmetric parabola (i.e., harmonic approximation) is used:
where is a bond stretching parameter (frequently called a bond stretching force constant) and its value represents the stiffness of the bond (i.e., the ease with which the bond can be distorted). Large values of correspond to ``hard'' bonds and small values to ``soft'' bonds. The value of corresponds to the optimal (i.e., unstrained) bond length. Contribution from this term is never negative and is equal to zero only when the actual bond length, , is exactly equal to the optimal bond length . The actual bond length is calculated from the cartesian coordinates of atoms i and j as in eq. 6.14. For accurate calculations more elaborate expressions for the bond stretching term are used, e.g., a Morse potential, which takes into account the fact that the potential for bond deformation is not symmetric and it is easier to stretch the bond than to squeeze it.
The form of the angle bending term is very similar. In most cases the contribution to potential energy, , is represented by a harmonic expression depending upon the bending force constant, , the actual value of the valence angle and its ``natural'', optimal value, :
The out-of-plane term is used to account for the energy contribution from distorting an aromatic or conjugated system of bonds from planarity and is most frequently a harmonic term:
where in the case of sp hybridized carbon, is an angle between one of the bonds originating at the central atom and a plane passing through the other two bonds.
The functional form of the torsional term is unique. In most cases, it depends on the cosine of the product of a torsional angle and the periodicity (see Fig. 6.21):
where is the height of the torsional barrier; s is -1 if a minimum occurs for the eclipsed conformation and +1 if a minimum corresponds to the staggered conformation; n is a periodicity, i.e., the number of maxima per full revolution; and is a torsional angle. For the torsional angle involving C--C bond in ethane, s=1 and n=3, while for the double bond in ethylene s=-1 and n=2. More elaborate force fields use as a torsional term the sum of several cosine functions with different periodicities to account for smaller ``humps'' appearing on the torsional energy dependence. On occasion, the harmonic potential is used for torsional angles around double bonds due to their stiffness.
Figure 6.21: Plot of torsional energy dependence upon the torsional angle:
.
Non-bonded terms describe contributions brought by the interaction of atoms which are not covalently bonded. Moreover, atoms which are two bonds apart (1-3 interactions), are usually not included in the non-bonded interaction list, since it is assumed that their interaction is satisfactorily accounted for by the angle bending term. However, non-bonded interactions of atoms separated by 3 covalent bonds in the molecular graph (1-4 interactions) are routinely included in spite of the fact that they appear in the torsional term, though their magnitude is frequently scaled by 50%. Generally, the non-bonded interactions represent contributions to energy from van der Waals interactions, electrostatic interactions and they frequently account explicitly for hydrogen bonds. There are many expressions for these terms and only the simplest ones will be described here to illustrate the underlying ideas.
The non-bonded terms are usually represented mathematicaly as two-body interactions. They depend on the coordinates of only two interacting atoms, i.e., are represented as pairwise potentials. This is an approximation, and it is well known that the interaction between two atoms depends significantly upon the positions of other atoms, especially those at close proximity. However, even for pairwise potentials, calculation of non-bonded terms is at present a significant computational effort proportional to the square in number of atoms. For three-body potentials (i.e., functions which depend on positions of three non-bonded atoms), this effort is proportional to the cube in number of atoms and unmanageable for larger molecules. For example, for a small sizes protein containing 1000 atoms, there are roughly pairwise interactions, or terms. The factor comes from the fact that each pair is included only once, i.e., the pair i--j is equivalent to j--i. For the three-body potentials we would have to calculate interaction for each possible combination of three atoms (excluding valence angles), i.e., close to or terms! For that reason, the three-body potentials are not routinely calculated. However, even two-body potentials are computationally demanding and approximations are used. The large number of terms is frequently reduced by omitting the interactions of distant atoms (i.e., skipping calculation of the non-bonded term for atoms which are further apart than some predetermined cut off distance). However, checking if atoms are outside the cut off distance is also a substantial computational effort for larger molecules.
Figure: Dependence of van der Waals energy upon distance between
two atoms approximated by a Lennard-Jones 12-6 potential (eq. 6.29).
Here: m=6, n=12, Å, kcal/mol,
Å
The van der Waals interaction energy, , is composed of two terms opposing each another, a repulsive term , and an attractive term, :
The repulsive term, , rapidly grows at close interatomic distances due to the overlapping of the electron clouds of the two atoms which results in the disruption of their electronic structure. This leads to the exposing of their nuclei and gives rise to a strong repulsion of positive charges. At moderate distances between atoms, i.e., larger than the sum of their van der Waals radii, the dispersion term, , dominates. These attractive forces were first identified by London in 1930, and exist even when molecules have no permanent charge or dipole moment. The dispersion energy is of quantum mechanical origin and cannot be completely described in classical terms. The electrons in an atom are in continuous motion and the electron distribution around the nucleus is constantly fluctuating giving rise to instantaneous dipole moments. Though on the average the dipole moment of an atom is zero, at any given moment there is some temporary dipole moment present which by induction produces a dipole moment of opposing direction in the neighboring atom. The attraction of these instantaneous dipoles results in a dispersion energy. The simplest and most widely used equation approximating this behavior is due to Lennard-Jones. It is illustrated in Fig. 6.22 for two atoms i and j as a function of their distance :
where is the depth of a well (i.e., the maximal attraction energy), n and m are exponents (typically n=12 or 9, and m=6), and is the distance between atoms which corresponds to a minimum. As you can see from Fig. 6.22, the curve climbs abruptly to large values for distances just below the optimal distance . At some distance, the repulsion energy is so large that further approach of atoms is no longer possible. This interatomic distance corresponds to the sum of the van der Waals radii of two atoms, i.e., the van der Waals surface has a strong justification in the form of interatomic forces. On the other side of the minimum, the dispersion energy decays rapidly with distance. Bear in mind, however, that for atoms buried in the center of a large molecule, the number of these distant interactions grows rapidly with distance (because you can place more atoms on the surface of the larger sphere), and the sum of all these minute interactions decays at a slower rate than for the isolated two-atom case.
The electrostatic interaction energy, , between two non-bonded atoms i and j is usually represented by Coulomb's law:
where , are the net atomic charges of atoms i and j, respectively, is the interatomic distance, and k is the dielectric constant of the medium between the interacting charges. Some force fields use the interaction of bond dipoles as an approximation to electrostatic contribution. Strictly speaking, Coulomb's law can only adequately describe the interaction of two point charges in a continuous medium. In our case, however, charges are not point charges and the medium is not continuous. For this reason, the Coulomb's law in its original form is a crude approximation.
The electrostatics of molecules (especially macromolecules) is an important factor in their biological function since it is responsible for long-range inter- and intramolecular forces. These are large forces compared to other interactions, e.g., the interaction energy of two point charges -0.3 a.u. and +0.3 a.u., respectively, separated by a distance of 10 Å in vacuum is kcal/mol, i.e., very substantial for such a large separation between these modestly charged atoms. For this reason, the research in the area of adequate representation of molecular electrostatics, yet computationally feasible, is very active. There are methods which allow one to incorporate polarizability, changes in dielectric constant (i.e., on the protein/solution interface), etc., and these have proved to be very successful in describing long range electrostatic interactions. However, these methods are computationally expensive and for routine calculations simplistic approximations are used. The dielectric constant, k, is a well defined quantity only for macroscopic systems. At the molecular level, a common approximation is to use its value for a vacuum (i.e., k=1). To take into account the presence of other atoms in the molecule or to incorporate the influence of the solvent, different values are used in the range between k=1 (vacuum) to k=80 (water). Another approach is to assume that the dielectric constant is proportional to the distance between charges, usually in a linear fashion, . This reflects the fact that for distant charges there is a higher probability for other atoms or solvent molecules (i.e., polarizable entities) to enter the space between the two charges.
The major ambiguity, however, is with the concept of atomic charge itself. The atom is made up of a nucleus and electrons orbiting around it, and its charge is zero by definition. However, in the molecule, averaged electron trajectories are no longer centrosymmetric around nuclei, due to the formation of chemical bonds between atoms, and the resulting average electron density can be a very convoluted function in three dimensions. Depending on atom electronegativity electrons are on average closer to more electronegative atoms and farther from atoms with low electronegativity. The net effect of this unequal electron distribution can be approximated by placing a number of point charges in such a way that they reproduce the electrostatics of the molecule with reasonable accuracy. Usually, these charges are placed at the positions of atomic nuclei and are called net atomic charges.
There are several ways of deriving atomic charges depending on which molecular property is chosen to be represented best. In many popular force fields they are chosen to reproduce known experimental data, e.g.: dipole moments, geometries, vibrational spectra, etc. They are frequently derived theoretically from a variety of electronegativity equalization schemes and quantum calculations at different levels of sophistication. In the case of quantum chemical calculations, the charges may be derived either from population analysis or fitted by least squares to the electrostatic potential. The charges derived from ab initio electrostatic potential are considered superior but this method is also the most computationally demanding.
The effect of hydrogen bonding is incorporated into the potential energy surface of a molecular system in a variety of ways. In general, hydrogen bonds are formed between hydrogen donor and hydrogen acceptor groups: --D--H...A--AA--. The simplest approach, which at the same time performs very well, is to assume that the bond between hydrogen donor atom D and hydrogen atom H is polarized to the extent that the electron density in the vicinity of the proton is negligible, i.e., that the proton does not contribute to the van der Waals energy. In this model (introduced by Hagler and coworkers), the proton and the acceptor atom attract each another since they possess opposing net charges. This results in the stretching of the D--H bond and driving atom A towards the hydrogen. Balance is achieved by contributions from the D--H stretching term, and electrostatic and van der Waals repulsions of atoms D and A. This approach does not impose any directionality on the hydrogen bonds but the balance of all of these forces usually yiels a geometry close to linear (i.e., when atoms D, H, A and AA lie on the straight line). Some more elaborate models explicitly use powers of and to enforce the linearity of the observed hydrogen bond geometry and incorporate the 12-10 Lennard-Jones potential for the distance between H and A.
The last category of terms which appears in the potential energy function is constraints. Their utility depends on the intended use of the potential energy function and they will be described later in this chapter. The individual energy terms differ in the magnitude of their contribution to the total potential energy of the molecule. The bond stretching and angle bending terms are considered ``hard'' terms, since even a small distortion in the ``optimal'' value of bond length or valence angle produces a very large positive contribution to the potential energy. For this reason, the variation in bond lengths and valence angles is usually negligible in molecules without substantial strain. Larger variations are only evident in strained ring systems or crowded molecules. In fact, the overall shape of the molecule results from a balance between the softer contributions, i.e., torsional terms and non-bonded terms. It is therefore a common practice to freeze bond lengths and valence angles at their optimal values and consider only torsional and non-bonded interactions as the first approximation.