# QM/MM

David Young
Cytoclonal Pharmaceutics Inc.

Various computational methods have strengths and weaknesses. Molecular mechanics can model very large compounds quickly. Quantum mechanics is able to compute many properties and model chemical reactions. It is possible to combine these two methods into one calculation, which models a very large compound using molecular mechanics and one crucial section of the molecule with quantum mechanics. This is designed to give results that have very good speed where only one region needs to be modeled quantum mechanically. This can also be used to model a molecule surrounded by solvent molecules.

## 1 Non-automated procedures

The earliest combined calculations were done simply by modeling different parts of the system with different techniques. For example, some crucial part of the system could be modeled by using an ab initio geometry optimized calculation. The complete system could then be modeled using molecular mechanics, by holding the geometry of the initial region fixed and optimizing the rest of the molecule.

This procedure gives a geometry for the whole system, although there is no energy expression that reflects non-bonded interactions between the regions. One use is to compute the conformational strain in ligands, which is important in determining the possibility of binding. In order to do this, the central portion is removed from the calculation, leaving just the ligands in the geometry from the complete system. Two energy calculations on these ligands are then performed, one without geometry optimization and one with geometry optimization. The difference between these two energies is the conformational strain that must be introduced into the ligands in order to form the compound.

Another technique is to use an ab initio method to parameterize force field terms specific to a single system. For example, an ab initio method can be used to compute the reaction coordinate for a model system. An analytic function can then be fitted to this reaction coordinate. A molecular mechanics calculation can then be performed with this analytic function being used to describe the appropriate bonds, etc.

## 2 Partitioning of Energy

Quantitative energy values are one of the most useful results from computational techniques. In order to have a reasonable energy expression when two calculations are combined, it is necessary to know not only the energy of the two regions, but also the energy of interaction between those regions. There have been a number of energy computation schemes proposed. Most of these schemes can be expressed generally as

E = EQM + EQM/MM + Epol + Eboundary

The first two terms are the energies of the individual computations. The EQM/MM term is the energy of interaction between these regions, assuming that both regions remain fixed. It may include Van der Waals terms, electrostatic interactions, or any term in the force field being used. Epol is the effect of either region changing as a result of the presence of the other region, such as electron density polarization or solvent reorganization. Eboundary is a way of representing the effect of the rest of the surroundings, such as the bulk solvent. The individual terms in EQM/MM, Epol, and Eboundary are discussed in more detail in the following section s.

### 2-1 Van der Waals

Most of the methods proposed include a van der Waals term for describing non-bonded interactions between atoms in the two regions. This is usually represented by a Leonard-Jones 6-12 potential of the form

EVdW = A/r12 - B/r6

The parameters A and B are those from the force field being used. A few studies have incorporated a hydrogen bonding term also.

### 2-2 Charge

The other term that is very widely used is a Coulombic charge interaction of the form

ECoulomb = Qi Qj / rij

The subscripts i and j denote two nuclei, one in the QM region and one in the MM region. The atomic charges for the MM atoms are obtained by any of the techniques commonly used in MM calculations. The atomic charges for the QM atoms can be obtained by a population analysis scheme. Alternatively, there might be a sum of interactions with the QM nuclear charges plus the interaction with the electron density, which is an integral over the electron density.

### 2-3 Describing bonds between regions

If the QM and MM regions are separate molecules, having non-bonded interactions only might be sufficient. If the two regions are parts of the same molecule, it is necessary to describe the bond connecting the two sections. In most cases, this is done using the bonding terms in the MM method being used. This is usually done by keeping every bond, angle or torsion term that incorporates one atom from the QM region. Alternatively, a few studies have been done in which a separate orbital based calculation was used to describe each bond connecting the regions.

### 2-4 Polarization

The energy terms above allow the shape of one region to affect the shape of the other and include the energy of interaction between regions. However, these non-bonded energy terms assume that the electron density in each region is held fixed. This can be a reasonable approximation for covalent systems. This is a poor approximation where the QM region is being stabilized by its environment, as is the case with polar solvent effects.

Polarization is usually accounted for by computing the interaction between induced dipoles. The induced dipole is computed by multiplying the atomic polarizability by the electric field present at that nucleus. The electric field used is often only that due to the charges of the other region of the system. In a few calculations, the MM charges have been included in the orbital based calculation itself as the interaction with point charges.

### 2-5 Solvent reorientation

Many of the methods define an energy function, then use that function for the geometry optimization. However, there are some methods that use a minimal coupling between techniques for the geometry optimization, then add additional energy corrections to the single point energy. In the latter case, some researchers have included a correction for the effect of the solvent molecules reorienting in response to the solute. This is not a widespread technique mostly because there is not a completely rigorous way to know how to correct for solvent reorientation.

### 2-6 Boundary terms

It is sometimes desirable to include the effect of the rest of the system, outside of the QM and MM regions. One way to do this is using periodic boundary conditions, as is done in liquid state simulations. Some researchers have defined a potential, which is intended to reproduce the effect of the bulk solvent. This solvent potential may be defined just for this type of calculation, or it may be a continuum solvation model. For solids, a set of point charges, called a Madelung potential is often used.

## 3 Energy subtraction

An alternative formulation of QM/MM is the energy subtraction method. In this method, calculations are done on various regions of the molecule with various levels of theory. Then the energies are added and subtracted, to give suitable corrections. This results in computing an energy for the correct number of atoms and bonds, analogous to an isodesmic strategy.

Three such methods have been proposed by Morokuma and co-workers. The integrated MO + MM (IMOMM) method combines an orbital based technique with an MM technique. The integrated MO + MO method (IMOMO) integrates two different orbital based techniques. The "our own n-layered integrated MO and MM" method (ONIOM) allows for three or more different techniques to be used in successive layers. The acronym ONIOM is often used to refer to all three of these methods since it is a generalization of the technique.

This technique can be used to model a complete system as a small model system and the complete system. The complete system would be computed using only the lower level of theory. The model system would be computed with both levels of theory. The energy for the complete system, combining both levels of theory, would then be

E = Elow,complete + Ehigh,model - Elow,model

Likewise a three layer system, could be broken into small, medium, and large regions, to be computed with low medium and high levels of theory (L, M, H respectively). The energy expression would then be

E = EH,small + EM,medium - EM,small + EL,large - EL,medium

This method has the advantage of not requiring a parameterized expression to describe the interaction of various regions. Any systematic errors in the way that the lower levels of theory describe the inner regions will be canceled out. The geometry of one region will affect geometry of the other, because this is not a systematic effect. Assuming transferability of parameters, this method avoids any over counting of the non-bonded interactions.

The disadvantage is that the lower levels of theory must be able to describe all atoms in the inner regions of the molecule. The effect of one region of the molecule causing polarization of the electron density in the other region of the molecule is incorporated only to the extent that the lower levels of theory describe polarization. This method requires more CPU time than most of the others mentioned. However the extra should be minimal since it is due to lower level calculations on smaller sections of the system.

## 4 Self consistent method

Bersuker and coworkers have proposed a technique where by the atoms on the boundary between regions are included in both calculations. In this procedure, optimizations are done with each method, using the boundary atom charge from the other method and this is repeated until the geometry is consistent between the levels of theory. They specify that the boundary atom cannot be part of a pi bridge between regions.

## 5 Truncation of the QM region

Molecular mechanics methods are defined atom by atom. Thus having a carbon atom without all of its bonds does not have a large affect on other atoms in the system. In contrast, quantum mechanical calculations use a wave function that can incorporate second atom effects. An atom with a nonfilled valence will behave differently than with the valence filled. Because of this, one must consider the way in which the quantum mechanical portion of the calculation is truncated.

A few of the earliest methods did truncated the atom on the dividing line between regions. Leaving this unfilled valence is reasonable only for a few of the very approximate semiempirical methods that were used at that time.

A number of methods fill the valence of the interface atoms with an extra orbital, sometimes centered on the connecting MM atom. This results in filling out the valence while requiring a minimum amount of additional CPU time. The concern, which is difficult to address, is that this might still result in affecting the chemical behavior of the interface atom or even inducing a second atom affect.

The other popular solution is to fill out the valence with atoms. Usually, H atoms are used, although pseudo-halide atoms have been used also. These pseudo-halide atoms are parameterized to mimic the behavior of the MM atom it is substituted for. These extra atoms are called "link atoms" or "junction dummy atoms". The link atoms are not included in the energy expression used to describe the interaction between the regions of the system. The use of link atoms is somewhat questionable, since they are often not subtracted from the final energy expression and may polarize the QM region incorrectly.

## 6 Region partitioning

The choice of where to locate the boundary between regions of the system is important. A number of studies have shown that very poor end results will be obtained if this is chosen improperly. Any bonds, which are being formed or broken, must be entirel y in the QM region of the calculation. Also, any section changing hybridization should be entirely in the QM region. Furthermore, it is best to keep conjugated or aromatic sections of the system completely in one of the regions. Where second or third atom effects are expected to be important, those atoms should be in the same region of the calculation.

## 7 Optimization

The more recently developed methods define an energy expression for the combined calculation, then use that expression to compute gradients for a geometry optimization. Some of the earlier methods would use a simpler level of theory for the geometry optimization, then add additional energy corrections to a final single point calculation. The current generation is considered to be the superior technique.

## 8 Incorporating QM terms in force fields

Rather than doing several complete calculations with an additional interface, it is possible to incorporate orbital based terms in a molecular mechanics method. The first methods for doing this incorporated simple Huckel or PPP semiempirical models to help describe pi system conjugation and aromaticity. There are also techniques for incorporating crystal field theory or ligand field theory type descriptions of transition metals, which have proven to be difficult to model entirely with molecular mechanics.

## 9 Recommendations

To date there have not been any large-scale comparisons of QM/MM models in which many different techniques were compared against each other and experimental results for a range of chemical systems. There does tend to be some preference towards the use of link atoms in order to insure the correct chemical behavior of the quantum mechanical region. Researchers are advised to consider the physical consequences of the effects which are included or excluded from various methods, as applied to their specific system. It is also prudent to verify results against experimental evidence where possible.

## Bibliography

Introductory descriptions are in
A. R. Leach "Molecular Modelling Principles and Applications" Longman, Essex (1996).
K. P. Eurenius, D. C. Chatfield, B. R. Brooks, M. Hodoscek, Int. J. Quantum Chem. 60, 1189 (1996).
J. Gao, Acc. Chem. Res. 29, 298 (1996).

Reviews of QM/MM are
J. Gao, "Reviews in Computational Chemistry Volume 7" K. B. Lipkowitz, D. B. Boyd, Eds., 119, VCH, New York (1992).
A. Warshel, "Computer Modeling of Chemical Reactions in Enzymes and Solutions" Wiley, New York (1991).
E. Clementi, "Computational Aspects for Large Chemical Systems" Springer, New York (1980).

Papers describing methods that incorporate modified orbitals are
J. Gao, P. Amara, C. Alhambra, J. J. Field, J. Phys. Chem. A 102, 4714 (1998).
V. Théry, D. Rinaldi, J.-L. Rivail, B. Maigret, G. G. Ferency, J. Comput. Chem. 15, 269 (1994).
A. Warshel, M. Levitt, J. Molec. Biol. 103, 227 (1976).

Methods describing methods using link atoms are
M. J. Field, P. A. Bash, M. Karplus, J. Comput. Chem. 11, 700 (1990).
U. C. Singh, P. A. Kollman, J. Comput. Chem. 7, 718 (1986).

Energy subtraction methods are described in
M. Svensson, S. Humbel, R. D. J. Froese, T. Matsubara, S. Sieber, K. Morokuma, J. Phys. Chem. 100, 19357 (1996).
S. Humbel, S. Sieber, K. Morokuma, J. Chem. Phys. 105, 1959 (1996).
F. Maseras, K. Morokuma, J. Comput. Chem. 16, 1170 (1995).

The self consistent method is
I. B. Bersuker, M. K. Leong, J. E. Boggs, R. S. Pearlman, Int. J. Quantum Chem. 63, 1051 (1997).

Methods incorporating orbital based terms in molecular mechanics force fields are described in
P. Comba, T. W. Hambley, "Molecular Modeling of Inorganic Compounds" VCH, New York (1995).
V. J. Burton, R. J. Deeth, J. Chem. Soc., Chem. Commun. 573 (1995).
P. Comba, M. Zimmer, Inorg. Chem. 33, 5368 (1994).
N. L. Allinger, J. T. Sprague, J. Am. Chem. Soc. 95, 3893 (1973).

An expanded version of this article will be published in "Computational Chemistry: A Practical Guide for Applying Techniques to Real World Problems" by David Young, which will be available from John Wiley & Sons in the spring of 2001.