CCL: Summary of Program to search SD/MDL formatted files from chemical
- From: "Donald Keidel"
- Organization: UCR
- Subject: CCL: Summary of Program to search SD/MDL formatted files
from chemical companies
- Date: Mon, 3 Nov 2003 15:15:43 -0800
Thank you all for all your responses and all your help. I greatly
appreciate everything. Have a great day.
The chemoinformatics toolkit CACTVS has very powerful substructure
search capabilities. It is free for academic use. See
We have, e.g., based our online Enhanced NCI Database Browser
(http://cactus.nci.nih.gov/ncidb2/) on it. We are using
search in SD files numbering in the millions of compounds (although you
get a huge additional speedup if you first convert them into the
internal CACTVS format). We are working to put a more complete CACTVS
documentation together - if you are interested, feel free to contact me.
For all other questions, please contact the author of CACTVS,
Wolf-Dietrich Ihlenfeldt (wdi|at|xemistry.com).
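For readers wondering what a search over an SD file involves at the lowest level: an SD file is a sequence of records, each one a molfile connection table ending in "M  END", followed by optional data items headed by lines such as "> <NAME>", and closed by a line containing "$$$$". The Python sketch below streams the records one at a time, so memory use does not grow with file size, and filters them on a data field. It is only an illustration and not how CACTVS works internally; the function names and the sample data (including the supplier name) are made up.

```python
import io
from typing import Dict, Iterator, List, TextIO, Tuple

Record = Tuple[str, Dict[str, str]]  # (molblock text, data fields)


def _split_record(lines: List[str]) -> Record:
    """Split one record's lines into the molblock and its data items."""
    # The connection table ends at the "M  END" line (or at the last line
    # if that marker is missing).
    end = next((i for i, l in enumerate(lines) if l.startswith("M  END")),
               len(lines) - 1)
    molblock = "\n".join(lines[:end + 1])
    fields: Dict[str, str] = {}
    name = None
    for line in lines[end + 1:]:
        if line.startswith(">"):
            # Data header such as "> <NAME>": the field name sits in <...>.
            lo, hi = line.find("<"), line.rfind(">")
            name = line[lo + 1:hi] if 0 <= lo < hi else None
            if name is not None:
                fields[name] = ""
        elif name is not None:
            if not line.strip():
                name = None  # a blank line closes the data item
            else:
                fields[name] = f"{fields[name]}\n{line}" if fields[name] else line
    return molblock, fields


def iter_sd_records(handle: TextIO) -> Iterator[Record]:
    """Yield (molblock, fields) for each "$$$$"-terminated record."""
    lines: List[str] = []
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.strip() == "$$$$":
            if lines:
                yield _split_record(lines)
            lines = []
        else:
            lines.append(line)
    if any(l.strip() for l in lines):  # tolerate a missing final "$$$$"
        yield _split_record(lines)


def find_by_field(handle: TextIO, field: str, wanted: str) -> Iterator[Record]:
    """Yield records whose data field equals `wanted` (case-insensitive)."""
    for molblock, fields in iter_sd_records(handle):
        if fields.get(field, "").strip().lower() == wanted.lower():
            yield molblock, fields


# Made-up two-record sample (empty connection tables, illustration only).
SAMPLE = """\
benzene
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NAME>
benzene

> <SUPPLIER>
Acme

$$$$
ethanol
  sketch

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NAME>
ethanol

$$$$
"""
```

Calling find_by_field(io.StringIO(SAMPLE), "NAME", "ethanol") yields only the second record. For a real vendor file, pass an open file handle instead of the StringIO object. This kind of text filtering only covers the data fields; structure queries need one of the toolkits mentioned in this summary.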
If you need a chemically intelligent database solution that can also be
used to build an intra- or internet application, please check out JChem
You can try it here: http://www.jchem.com/examples.html
JChem can be downloaded from the site. An example application that can
be easily customized is included in the package.
The software is in Java so it is portable. It can be connected to
several databases, like MySQL, PostgreSQL, Oracle, etc.
If you are looking for a free substructure search program, try obgrep,
which is a tool available at the Open Babel SourceForge web site
The program uses the Daylight SMARTS language to perform the query.
Unfortunately, the program is not in the current release (1.100.1) for
the moment, but you can get it by anonymous access to the CVS
Omega from OpenEye (http://www.eyesopen.com) might do what you are
looking for, and is free for academics, as far as I know.
The CACTVS Chemoinformatics toolkit, which contains a full and
competitive substructure search engine, is free for academic use and can
be downloaded from www.xemistry.com/academic. This engine drives the NCI
database (http://cactus.nci.nih.gov/ncidb2). This is not a graphical
environment but a scripting tool (and because of this far more powerful
than ISIS/Base; it does read ISIS query files and Daylight SMARTS, though).
Not sure what exactly you mean by "to search SD/MDL formatted files".
Our open source java library for structural chemo- and bioinformatics,
The Chemistry Development Kit (CDK) can read SD/MDL files and
perform a number of operations on them, like searching for a particular
substructure, etc. Other operations, like calculating certain
descriptors, can be quickly implemented.
The CDK lives on http://cdk.sourceforge.net.
Feel free to contact the CDK development team on the
cdk-devel|at|lists.sourceforge.net mailing list to discuss whatever
CDK-related question you have.
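To make "searching for a particular substructure" concrete, here is a deliberately naive Python sketch of the idea. It reads the atom and bond tables of a V2000 molblock using the fixed-width columns of that format, then uses backtracking to test whether a query graph can be mapped onto a target. This is not the CDK API and not how CDK, CACTVS or obgrep implement matching: it compares element symbols and connectivity only and ignores bond orders, aromaticity, charges, hydrogens and stereochemistry. All function names, including the make_molblock helper that builds the test input, are made up for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple

Graph = Tuple[List[str], Dict[int, Set[int]]]  # (element symbols, adjacency)


def make_molblock(symbols: Sequence[str], bonds: Sequence[Tuple[int, int]]) -> str:
    """Build a minimal V2000 molblock (all coordinates zero, single bonds)."""
    lines = ["", "  sketch", "",
             f"{len(symbols):3d}{len(bonds):3d}  0  0  0  0  0  0  0  0999 V2000"]
    for sym in symbols:
        lines.append(f"{0.0:10.4f}{0.0:10.4f}{0.0:10.4f} {sym:<3}"
                     " 0  0  0  0  0  0  0  0  0  0  0  0")
    for a, b in bonds:  # atom numbers are 1-based in the file format
        lines.append(f"{a:3d}{b:3d}  1  0")
    lines.append("M  END")
    return "\n".join(lines)


def parse_v2000(molblock: str) -> Graph:
    """Read element symbols and connectivity from a V2000 molblock."""
    lines = molblock.splitlines()
    counts = lines[3]  # fourth line: atom and bond counts, 3 columns each
    n_atoms, n_bonds = int(counts[0:3]), int(counts[3:6])
    # The element symbol occupies columns 32-34 of each atom line.
    atoms = [lines[4 + i][31:34].strip() for i in range(n_atoms)]
    adj: Dict[int, Set[int]] = {i: set() for i in range(n_atoms)}
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        a, b = int(line[0:3]) - 1, int(line[3:6]) - 1
        adj[a].add(b)
        adj[b].add(a)
    return atoms, adj


def has_substructure(target: Graph, query: Graph) -> bool:
    """True if every query atom and bond can be mapped onto the target."""
    t_atoms, t_adj = target
    q_atoms, q_adj = query
    mapping: Dict[int, int] = {}  # query atom index -> target atom index

    def extend(q: int) -> bool:
        if q == len(q_atoms):
            return True  # all query atoms placed
        for t in range(len(t_atoms)):
            if t in mapping.values() or t_atoms[t] != q_atoms[q]:
                continue
            # Every already-mapped neighbour of q must be bonded to t.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if extend(q + 1):
                    return True
                del mapping[q]  # backtrack
        return False

    return extend(0)


# Heavy-atom skeleton of ethanol (C-C-O) and a C-O query.
ethanol = parse_v2000(make_molblock(["C", "C", "O"], [(1, 2), (2, 3)]))
c_o = parse_v2000(make_molblock(["C", "O"], [(1, 2)]))
```

With these definitions, has_substructure(ethanol, c_o) is true, while a three-carbon chain query is not found in ethanol. The search is exponential in the worst case, which is one reason real toolkits use far more refined matching and pre-screening.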
CACTVS is a freely available chemical extension of the Tcl scripting
There are some commands to allow you to search chemical database files
(either within a flat file or from a DBMS).
Although a powerful language, the main drawback is the lack of
Consequently, I am attaching a PDF of the beginning of a manual I was
trying to create from notes of some of our group members here.
I warn you that it is incomplete (and on hold indefinitely, as
priorities have shifted) and may not be totally accurate from a
conceptual viewpoint. However, that said, something is better than
nothing so... Also, towards the end of the manual I have just added the
structure searching notes verbatim.
Hope the information helps,
MarvinView can search SD files <http://www.chemaxon.com/marvin/>
Hope this helps