CML (Chemical Markup Language) Update
- From: Peter Murray-Rust <pm286 #*at*# cam.ac.uk>
- Subject: CML (Chemical Markup Language) Update
- Date: Thu, 16 May 2002 17:54:48 +0100
CML is an Open XML infrastructure which supports molecular information and
is compliant with current W3C (World Wide Web Consortium) protocols and
philosophy. CML is designed as an application-independent adapter for
chemical information. This is to announce a number of new developments.
(Much of the work is jointly with Henry Rzepa).
(A) Specifications: The W3C released the XML Schema Recommendation last
year and we have converted the current V1.0 DTD to a schema. This is
at a draft stage, but increases the power of the language while simplifying
some of the syntax. W3C schemas are seen as the main way forward for
XML applications and will support a variety of powerful new tools like
forms, XML Query, etc. In addition we have developed a core XML language
(STMML) for representing scientific data (data structures such as
matrices, many data types, scientific units, metadata, etc.). STMML is a
core part of CML but can be re-used in other applications.
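
As a rough illustration of the kind of data STMML is meant to carry, the
Python sketch below reads a single typed scalar with units. The element
and attribute names (scalar, title, dataType, units) and the value are
assumptions chosen for illustration; they are not quoted from the draft
schema, and the code uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical STMML-style fragment; names and values are illustrative only.
FRAGMENT = (
    '<scalar title="meltingPoint" dataType="xsd:decimal" '
    'units="units:K">273.15</scalar>'
)


def read_scalar(xml_text):
    """Return (title, value, units) from a single scalar element."""
    elem = ET.fromstring(xml_text)
    # Convert the text to a number only when the declared type says so.
    if elem.get("dataType") == "xsd:decimal":
        value = float(elem.text)
    else:
        value = elem.text
    return elem.get("title"), value, elem.get("units")


title, value, units = read_scalar(FRAGMENT)
print(title, value, units)
```

Keeping the data type and units as attributes next to the value is what
lets a generic tool interpret the number without chemistry-specific code.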
(B) Programming: We have created a W3C DOM (Document Object Model) for
CML. This forms an Open abstract data model (and API) for those writing
molecular applications in XML (and other OO approaches such as UML). The
CML DOM has been developed alongside the Life Sciences Research group's
specification for small molecules for the Object Management Group. We have
also created SAX2-based modules for CML. These interfaces are available
under an Open Source license to any developers.
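
To make the difference between the two programming styles concrete, here
is a minimal Python sketch that reads the same molecule once through a
DOM tree and once through SAX events. It uses Python's generic
xml.dom.minidom and xml.sax modules, not the CML DOM or the CML SAX2
modules themselves, and the fragment (molecule, atomArray, atom,
elementType) is a hypothetical CML-style example rather than text from
the specification.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical CML-style fragment; element and attribute names are assumptions.
CML = """<molecule id="m1" title="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
</molecule>"""


def elements_via_dom(text):
    """DOM style: build the whole tree in memory, then query it."""
    doc = xml.dom.minidom.parseString(text)
    atoms = doc.getElementsByTagName("atom")
    return [atom.getAttribute("elementType") for atom in atoms]


class AtomHandler(xml.sax.ContentHandler):
    """SAX style: collect element types as the parser streams the document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def startElement(self, name, attrs):
        if name == "atom":
            self.elements.append(attrs.get("elementType"))


def elements_via_sax(text):
    handler = AtomHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.elements


print(elements_via_dom(CML))
print(elements_via_sax(CML))
```

The DOM route suits editing and random access to a molecule; the SAX
route suits large files, since nothing is kept beyond what the handler
chooses to store.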
(C) Data: We are working closely with the National Cancer Institute,
who are converting their database to CML.
(D) CML is now an integral part of several Open Source projects such as
OpenBabel, JMOL, JChemPaint and XDrawChem (all on http://www.sourceforge.net).
(E) A new resource has been set up at SourceForge, the Open Source
repository. Project Page: http://cml.sourceforge.net; Downloads and CVS
repository at http://www.sourceforge.net/projects/cml. There is much
toolkits. We have developed a C++ SAX2-like parser for OpenBabel:
(F) SELFML: The SELF project (Prof. Henry Kehiaian, Paris) has created an
ontology for physicochemical data (especially properties of molecules and
mixtures of molecules). SELFML is the XML incarnation of SELF and
covers the data and the specifications (dictionary entries). SELFML works
with CML and can therefore support collections of molecular properties
(catalogues, dictionaries, etc.)
(G) CML is being extended to support reactions and computational
chemistry. We are starting to convert codes (initially Open Source such as
MOPAC and GROMACS) to read and write XML, and to develop the XML ontologies
required. Offers of collaboration welcomed :-)
Peter Murray-Rust, pm286 AT cam.ac.uk
Unilever Centre for Molecular Informatics, Chemistry Department
Lensfield Road, Cambridge, CB2 1EW, UK