MMFF94s Validation Suite
(Revised June 1999)
This revision replaces Tom Halgren with Simon Kearsley as the person to contact at Merck; Tom has moved to Schrodinger, Inc. effective June 1, 1999 (E-mail address: email@example.com).
The test molecules for this validation suite consist of 235 structures taken from the larger set of 761 Cambridge Structural Database structures that comprised the MMFF94 Validation Suite. Each contains one or more delocalized trigonal nitrogen atoms and therefore is treated differently by MMFF94s than by MMFF94. The structures were "prepared" as described in the documentation for the MMFF94 Validation Suite. MMFF94s has been discussed in a recent paper: T. A. Halgren, J. Comput. Chem., 20, 720-729 (1999). Because these are MMFF94s-optimized structures, their geometries differ from those employed in the MMFF94 suite. The special out-of-plane and torsional parameter files MMFFSOOP.PAR and MMFFSTOR.PAR used by MMFF94s can be accessed via an Internet browser at http://journals.wiley.com (select "Journal of Computational Chemistry", then "Supplementary Material", then "Volume 20", then the hyperlink for page 720) or at ftp://ftp.wiley.com/public/journals/jcc/suppmat/20/720. The parameter files can also be accessed by ftp at firstname.lastname@example.org; cd to public/journals/jcc/suppmat/20/720.
In addition to input molecular structure files and auxiliary data, the
suite provides output files from computer runs made using Merck's OPTIMOL
molecular-mechanics program and BatchMin 5.5 from Columbia University.
Note: some files are quite large. Before downloading, you may want to check the sizes listed at the end of this document. You may prefer to retrieve the compressed tar archive of these files, MMFF94s.tar.gz (2.5 MBytes), and unpack it by giving the following UNIX command:
gunzip -c MMFF94s.tar.gz | tar xvof -
The following files comprise the input molecular structure data:
    MMFF94s_dative.mol2
    MMFF94s_hypervalent.mol2
    MMFF94s.mmd
Two formats are provided: mol2, from Tripos, and mmd, the designation used at Merck for BatchMin dat files. We chose these file formats because they are in fairly widespread use and because they allow explicit single and multiple bonds to be designated. Unlike file formats more commonly used at Merck, these formats are limited in that they cannot specify formal-charge information. However, this information, which is identical to that for MMFF94, is included in the MMFF94 Validation Suite.
For the convenience of the user, the mol2 files are presented in two versions. One of these -- MMFF94s_dative.mol2 -- uses dative bonding in tetracoordinate sulfur and phosphorus compounds. This representation, for example, treats a sulfonamide as having four single bonds to a +2 sulfur, two of which come from formally negative terminal oxygen atoms. This is the native representation for OPTIMOL, the host program for MMFF. In contrast, the native BatchMin representation features two double bonds from formally neutral oxygen atoms to a formally neutral sulfur, for a (hypervalent) total of six bonds to sulfur; correspondingly "hypervalent" phosphorus compounds have a total of five bonds to phosphorus. This hypervalent bonding pattern is used in the MMFF94s_hypervalent.mol2 and MMFF94s.mmd files in the validation suite. Note: the atom types in the mol2 files (which were generated by a file conversion procedure developed at Merck) in some cases differ from authentic SYBYL atom types, and therefore should not be relied upon.
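The relationship between the two representations can be illustrated with a short Python sketch. It is not the conversion performed by mmff_setup or by the Merck file-conversion procedure; it is a simplified illustration that handles only terminal oxygens double-bonded to a tetracoordinate sulfur or phosphorus. The Molecule class and the hypervalent_to_dative function are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Minimal, hypothetical container: elements, formal charges, bond orders."""
    elements: list                               # element symbol per atom
    charges: list                                # formal charge per atom
    bonds: dict = field(default_factory=dict)    # {(i, j): bond order}

    def neighbors(self, i):
        """Indices of the atoms bonded to atom i."""
        return [b if a == i else a for (a, b) in self.bonds if i in (a, b)]


def hypervalent_to_dative(mol):
    """Turn S=O / P=O double bonds on tetracoordinate S or P into
    charge-separated single bonds (simplified assumption: terminal O only)."""
    for (a, b), order in list(mol.bonds.items()):
        if order != 2:
            continue
        for center, oxy in ((a, b), (b, a)):
            is_center = (mol.elements[center] in ("S", "P")
                         and len(mol.neighbors(center)) == 4)
            is_terminal_o = (mol.elements[oxy] == "O"
                             and len(mol.neighbors(oxy)) == 1)
            if is_center and is_terminal_o:
                mol.bonds[(a, b)] = 1        # double bond becomes single
                mol.charges[center] += 1     # S or P gains +1 per oxygen
                mol.charges[oxy] -= 1        # terminal oxygen becomes -1
                break
    return mol


# Heavy-atom skeleton of a sulfonamide: C-S(=O)(=O)-N, hypervalent form.
sulfonamide = Molecule(
    elements=["C", "S", "O", "O", "N"],
    charges=[0, 0, 0, 0, 0],
    bonds={(0, 1): 1, (1, 2): 2, (1, 3): 2, (1, 4): 1},
)
hypervalent_to_dative(sulfonamide)
print(sulfonamide.charges, sulfonamide.bonds)
```

For the sulfonamide skeleton the sketch leaves four single bonds to a +2 sulfur and two formally negative oxygens, which is the dative pattern described above.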
Results of the MMFF94s calculations are contained in the following three files:
    MMFF94s.energies
    MMFF94s_bmin.log
    MMFF94s_opti.log
The MMFF94s.energies file lists the molecule name (CSD refcode), the total MMFF94s energy computed by OPTIMOL, and the BatchMin 5.5 energy. It should be noted that the BatchMin calculations used a locally modified version of the mmff_setup co-process in which mmff_setup was enhanced to handle the full range of hypervalent -> dative bonding conversions encountered in the validation suite; some cases were not properly accommodated in the distributed BatchMin 5.5 and 6.0 code, but all should be properly handled beginning with BatchMin 6.5 (these internal bonding conversions are needed because the mmff_setup code, which was derived from OPTIMOL, assumes dative bonding). In all cases, no cutoffs on nonbonded interactions were employed and a unit dielectric constant was used. As comment records in the MMFF94s.energies file indicate, the OPTIMOL and BatchMin total energies agree to within 0.0001 kcal/mol in all but 5 instances; the largest difference is about 0.0021 kcal/mol. These 5 cases are ones in which a formal ionic charge is shared among three atoms of the same MMFF atom type (e.g., the three nitrogens of a guanidinium group); the single-precision division by 3 in the BatchMin run produces a less precise final partial atomic charge and less accurate total MMFF94s energy (though the differences are of course inconsequential for any practical purpose).
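The effect of a single-precision division by 3 can be seen in isolation with a few lines of Python. This is only an illustration of the rounding involved; it does not reproduce the MMFF charge-assignment procedure or the size of the resulting energy differences, and the helper to_float32 is a name introduced here.

```python
import struct


def to_float32(x):
    """Round a double-precision Python float to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


formal_charge = 1.0                              # e.g. the +1 charge of a guanidinium group
share_double = formal_charge / 3.0               # share per nitrogen in double precision
share_single = to_float32(formal_charge / 3.0)   # the same share stored in single precision

rounding_error = abs(share_single - share_double)
print(f"double-precision share: {share_double:.17f}")
print(f"single-precision share: {share_single:.17f}")
print(f"difference            : {rounding_error:.3e}")
```

The single-precision share carries only about seven significant decimal digits, so each of the three atoms receives a slightly different charge than it would in a double-precision calculation.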
The MMFF94s_bmin.log file contains BatchMin 5.5 output, obtained on an SGI R10000 processor, for single-point energy calculations on input structures read from the MMFF94s.mmd file. This log file partitions the total energy into components such as bond stretching, angle bending, torsion, van der Waals, and electrostatic. It provides the next level of information beyond the simple compilation of total energies found in the MMFF94s.energies file.
Finally, the MMFF94s_opti.log file contains the output from an OPTIMOL run that employed as input an internal Merck-format data file, MMFF94s.ffd, that contains a superset of the information provided in the file MMFF94s_dative.mol2 (which was created from it). This log file provides by far the greatest amount of validation information. Except that it does not contain information about the MMFF empirical-rule generation procedures, this file provides information equivalent to that given in the corresponding file in the MMFF94 Validation Suite. The OPTIMOL run was also made on an R10000 processor.
The MMFF94.titles file gives short titles for all of the molecules in the validation suite. Other information in the parent MMFF94 Validation Suite, e.g., information on formal-charge assignments, is the same as for MMFF94 and is not repeated here.
Recommendation and Request
To validate an MMFF94s implementation, it makes sense to choose a subset of the validation suite, to convert the mol2 or mmd input data to another format if necessary, and then to begin by computing and comparing total energies to those listed in the MMFF94s.energies file; if and when differences are found, the component energies can then be compared to those listed in the MMFF94s_bmin.log or MMFF94s_opti.log files. Examination of the detailed interaction listings in the OPTIMOL log file might then be needed to diagnose a problem. Ultimately, the entire validation suite should be checked. As in the case of MMFF94, it is the implementer's choice as to whether to use a dative- or hypervalent-bonding representation for affected compounds, or to support both formats.
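The first step of this procedure, comparing total energies, might be sketched in Python as follows. The record layout assumed for MMFF94s.energies (name, OPTIMOL energy, BatchMin energy in whitespace-separated columns, with any other line skipped) is an assumption and should be checked against the actual file. The function names, the default tolerance of 0.0001 kcal/mol, and the sample names and energies are placeholders introduced here, not data from the suite.

```python
def parse_energies(text):
    """Parse 'name E_optimol E_batchmin' records (assumed column layout)."""
    reference = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue                  # blank line or not a 3-column record
        try:
            reference[fields[0]] = (float(fields[1]), float(fields[2]))
        except ValueError:
            continue                  # comment or header record
    return reference


def compare_totals(computed, reference, tol=1e-4):
    """Return {name: difference} where the computed total energy deviates
    from the OPTIMOL reference by more than tol (kcal/mol)."""
    mismatches = {}
    for name, energy in computed.items():
        if name not in reference:
            continue                  # molecule not in the reference set
        diff = energy - reference[name][0]
        if abs(diff) > tol:
            mismatches[name] = diff
    return mismatches


# Placeholder records, NOT taken from the validation suite.
sample_text = """\
* placeholder comment
MOL_A   12.3456   12.3456
MOL_B   -7.8901   -7.8901
"""
reference = parse_energies(sample_text)
computed = {"MOL_A": 12.3456, "MOL_B": -7.9000}   # hypothetical results
for name, diff in compare_totals(computed, reference).items():
    print(f"{name}: differs from reference by {diff:+.4f} kcal/mol")
```

Molecules flagged by such a comparison are the ones whose component energies would then be examined against the BatchMin or OPTIMOL log files.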
We have two requests. The first is that any implementation of MMFF94s be identified simply as MMFF94s, and that the name Merck not be used in product literature or in any other way.
The second request is that any implementation of MMFF94s be explicitly characterized by its authors as to whether it is: (1) partial, or (2) complete. An implementation should not be labeled complete unless it is applicable to all 235 molecules in the test suite and produces total and component energies that match those posted here to within numerical precision. For a partial implementation, published descriptions and product literature should state the degree to which the implementation is applicable to the molecules in the validation suite and the degree to which it produces authentic results for those members of the suite to which it is applicable; a clear statement should also be made as to whether or not the MMFF functional form has been fully implemented, as well as whether or not the MMFF step-down equivalencing protocol for default parameter assignment is fully utilized and whether or not the MMFF empirical-rule procedures for parameter generation are faithfully employed.
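The numerical part of this characterization could be tallied along the following lines. This is a minimal Python sketch: the function name characterize and its input format are hypothetical, and the tolerance standing in for "numerical precision" is an assumption, since no specific threshold is stated here. The statements about functional form, step-down equivalencing, and empirical-rule procedures cannot be derived from energies and still have to be made explicitly.

```python
def characterize(differences, suite_size=235, tol=1e-4):
    """Summarize coverage and agreement for an implementation.

    differences maps each molecule the implementation can handle to a list
    of (computed - reference) values for its total and component energies.
    tol (kcal/mol) is an assumed stand-in for "numerical precision".
    """
    handled = len(differences)
    matching = sum(
        1 for diffs in differences.values()
        if all(abs(d) <= tol for d in diffs)
    )
    complete = handled == suite_size and matching == suite_size
    return {
        "label": "complete" if complete else "partial",
        "handled": handled,        # molecules the implementation applies to
        "matching": matching,      # of those, molecules with authentic energies
        "suite_size": suite_size,
    }


# Hypothetical outcome: 200 molecules handled, 2 of them outside tolerance.
example = {f"MOL_{i}": [0.0, 0.0] for i in range(198)}
example["MOL_198"] = [0.01, 0.0]
example["MOL_199"] = [0.0, -0.02]
print(characterize(example))
```

For a partial implementation, the handled and matching counts give the two degrees of applicability and authenticity that the request asks to be stated.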
Papers 1 and 5 present the original derivation of MMFF94, while Paper 6 describes the derivation and performance of MMFF94s.
Paper 7 compares the abilities of MMFF94, MMFF94s, CFF95, CVFF, MSI CHARMm, AMBER*, OPLS*, MM2*, and MM3* (1) to reproduce experimental and theoretical values for conformational energies, and (2) to produce reasonable values and trends for intermolecular-interaction energies and geometries in hydrogen-bonded complexes. Some results are also presented for CHARMM 22. The input data used in evaluating force fields has been posted elsewhere on the CCL archives in the hope that it will help others to test additional force fields. This force-field evaluation suite can be accessed via a web browser at:
File name                    Size in Bytes
------------------------------------------
MMFF94s.energies                    11,456
MMFF94s.mmd                        945,083
MMFF94s.tar.gz                   2,533,318
MMFF94.titles                       18,815
MMFF94s_bmin.log                   415,718
MMFF94s_dative.mol2                659,080
MMFF94s_hypervalent.mol2           659,081
MMFF94s_opti.log                 9,834,918
|Modified: Fri Jun 25 14:57:04 1999 GMT|