SMILES Summary
SMILES CODE FOR DESCRIBING MOLECULAR STRUCTURES
I received about 20 responses to my request for information on Smiles Code.
Most of the responses can be summarized as follows:
1. Smiles is an easy-to-use code for describing a chemical structure as
a string of text. For simple molecules without rings, the text string is
similar to the usual chemical structure, written as a line of text, with
hydrogen atoms omitted. The Smiles code can be read as input by
computer programs and chemical databases.
2. Excellent tutorials and information on Smiles Code can be obtained
from the web site: http://www.daylight.com
3. An excellent journal reference, with clearly explained rules
for Smiles Code, is
"SMILES, a Chemical Language and Information System. 1. Introduction
to Methodology and Encoding Rules" by David Weininger
J. Chem. Inf. Comput. Sci. 1988, Vol 28, pages 31-36
However, just as the chemical structure for a chemical compound can often
be represented in many ways, the Smiles code is often not unique. A later
paper describes a method for the unique generation of Smiles code. However,
that paper discusses computer algorithms and will not be of interest to a
chemist who wants to know the basic encoding rules. The reference is
"SMILES. 2. Algorithm for Generation of Unique SMILES Notation"
David Weininger, Arthur Weininger, Joseph L. Weininger
J. Chem. Inf. Comput. Sci. 1989, Vol 29, pages 97-101
SPECIFIC COMMENTS AND EXAMPLES OF SMILES CODE
From: jsb2 (- at -) camsoft.com
Subject: Re: CCL:Smiles Code
SMILES is a line notation for chemical structures. It was developed by
Daylight and lots of information is available from their site
(http://www.daylight.com).
Basically, single bonds are implied by default,
and hydrogens are implicit, so CCC is propane. Branches are shown by
parentheses: CC(O)C is isopropanol. Double bonds are equals signs: CC=CC is
2-butene. Ring closures are shown by matching numbers: C1CCCCC1 is
cyclohexane. The rules get more complicated, but that's the general idea.
SMILES is a very compact way to store chemical structures in a textual
form.
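
As a rough illustration of the rules just described (implicit single bonds,
implicit hydrogens, parentheses for branches, "=" for double bonds, matching
digits for ring closures), here is a minimal Python sketch. It is not
Daylight's algorithm or any toolkit's API. The function name formula_counts
and the small valence table are assumptions made for this example, and it
only handles unbracketed C, N and O atoms with single-digit ring closures.

```python
from collections import Counter

# Assumed normal valences for the tiny atom subset handled here.
VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def formula_counts(smiles):
    """Return element counts (with implicit H) for a very small SMILES subset."""
    symbols = []      # element symbol of each atom, in order of appearance
    used = []         # sum of bond orders already attached to each atom
    branch_stack = []  # atoms to return to when a ")" closes a branch
    ring_open = {}    # ring digit -> (atom index, bond order given at opening)
    prev = None       # index of the atom the next atom will bond to
    pending = 1       # bond order for the next bond (single by default)

    for ch in smiles:
        if ch in VALENCE:
            symbols.append(ch)
            used.append(0)
            idx = len(symbols) - 1
            if prev is not None:
                used[prev] += pending
                used[idx] += pending
            prev = idx
            pending = 1
        elif ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            if not branch_stack:
                raise ValueError("unmatched ')' in %r" % smiles)
            prev = branch_stack.pop()
        elif ch.isdigit():
            if prev is None:
                raise ValueError("ring digit before any atom in %r" % smiles)
            if ch in ring_open:
                other, order = ring_open.pop(ch)
                order = max(order, pending)
                used[prev] += order
                used[other] += order
            else:
                ring_open[ch] = (prev, pending)
            pending = 1
        else:
            raise ValueError("unsupported character %r" % ch)

    if branch_stack or ring_open:
        raise ValueError("unclosed branch or ring in %r" % smiles)

    counts = Counter(symbols)
    # Implicit hydrogens fill whatever valence the written bonds leave open.
    counts["H"] = sum(VALENCE[s] - u for s, u in zip(symbols, used))
    return dict(counts)


# The examples from the message above:
print(formula_counts("CCC"))        # propane, C3H8
print(formula_counts("CC(O)C"))     # isopropanol, C3H8O
print(formula_counts("CC=CC"))      # 2-butene, C4H8
print(formula_counts("C1CCCCC1"))   # cyclohexane, C6H12
```

Aromatic (lowercase) atoms, bracket atoms, charges and stereochemistry are
deliberately left out of this sketch.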
CS ChemDraw Net is freely available from
http://www.camsoft.com/chemfinder/download.html and will allow you to create
SMILES strings for most any structure you can draw. You can use SMILES
strings to search WWW databases such as the one at
http://chemfinder.camsoft.com (and they of course have many other non-WWW
uses as well).
Jonathan Brecher
CambridgeSoft Corporation
jsb (- at -) camsoft.com
From: Sjors Wurpel <sjorsw (- at -) org.chem.uva.nl>
A SMILES string is a way of describing a chemical structure in a line of
text. It can be created with, e.g., the CSC ChemDraw package (Copy SMILES).
It looks like this (the string below does not encode the trans
stereochemistry):
trans-2-amino-cyclohexanol = [NH2]C1C([OH])CCCC1
From: "J. Eric Slone" <eslone (- at -) erols.com>
SMILES stands for "Simplified Molecular Input Line Entry System"; there is
an on-line guide on the web.
Examples are CC(=O)O for acetic acid and c1ccccc1 for benzene.
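
To make the pieces of these strings easier to see, the following Python
sketch splits a SMILES string into tokens: bracket atoms such as [NH2],
plain atoms, lowercase aromatic atoms, bond symbols, parentheses and ring
digits. The regular expression and the name tokenize are assumptions for
illustration only. They cover just the common organic subset and do not
handle two-digit ring closures (%nn) or stereo bond symbols.

```python
import re

# One alternative per token type; bracket atoms are matched first so that
# their contents are kept together as a single token.
TOKEN = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]|[-=#.()]|\d")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens (small subset only)."""
    tokens = TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so check nothing was lost.
    if "".join(tokens) != smiles:
        raise ValueError("unrecognized characters in SMILES: %r" % smiles)
    return tokens


print(tokenize("CC(=O)O"))               # acetic acid
print(tokenize("c1ccccc1"))              # benzene, aromatic atoms in lowercase
print(tokenize("[NH2]C1C([OH])CCCC1"))   # bracket atoms stay in one piece
```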
From: Alan Shusterman <Alan.Shusterman (- at -) directory.Reed.EDU>
SMILES is a language that was invented by David Weininger (DAYLIGHT Inc.).
The language provides a simple means for writing complex molecular
structures as a one-line code that a computer can recognize. The SMILES
coding of most molecular structures is not unique, but Weininger was able
to find an efficient way to compare and recognize different SMILES for the
same molecule, and to use this as a database key for information about the
molecule.
See: http://www.daylight.com OR e-mail: info (- at -) daylight.com
Alan Shusterman
Department of Chemistry
Reed College
Portland, OR 97202
From: "Gregory L. Durst - DowElanco R&D" <GDURST (- at -) elvax2.dowelanco.com>
SMILES stands for "Simplified Molecular Input Line Entry System",
see the orig paper ...
D. Weininger, "JCICS", vol28, (1988), 31-36.
For more information contact:
Daylight Chemical Information Systems
phone: 714/367-9990 (Mission Viejo, CA)
web url: http://www.daylight.com
From: tj ODonnell <tj (- at -) eecs.uic.edu>
SMILES is a line notation to represent chemical structures on
computers. It was invented by Dave Weininger (now of Daylight, Inc.)
and is used by lots of folks in molecular computing.
Try looking at www.daylight.com for information.
More specifically:
http://www.daylight.com/dayhtml/smiles/
From: D.Winkler (- at -) chem.csiro.au (Dr. Dave Winkler)
Subject: Re: CCL:Smiles Code
SMILES is a very compact, very intuitive way of representing any molecular
structure. There is a good tutorial on the Daylight page:
http://www.daylight.com/dayhtml/smiles/
OTHER SUGGESTED SITES FOR INFORMATION AND TUTORIALS
From: Wolf-Dietrich Ihlenfeldt <wdi (- at -) schiele.organik.uni-erlangen.de>
Dr. Wolf-D. Ihlenfeldt
Computer Chemistry Center, University of Erlangen-Nuernberg
Naegelsbachstrasse 25, D-91052 Erlangen (Germany)
Tel (+49)-(0)9131-85-6579 Fax (+49)-(0)9131-85-6566
http://www.daylight.com/dayhtml/smiles/smiles-intro.html
http://schiele.organik.uni-erlangen.de/services/smiles.html
From: Robert Fraczkiewicz <robert (- at -) pauli.utmb.edu>
http://schiele.organik.uni-erlangen.de/services/smiles.html
From: Soaring Bear <bear (- at -) ellington.pharm.arizona.edu>
My chemistry web page has three links to SMILES tutorials on the web:
http://fox.pomona.claremont.edu/chem/SMILES/index.html (pomona)
http://www.daylight.com/dayhtml/smiles/smiles-intro.html (daylight)
http://schiele.organik.uni-erlangen.de/services/smiles.html (schiele)
From: Bill Laidig <laidig (- at -) pg.com>
http://fox.pomona.claremont.edu/chem/SMILES/index.html
Bill Laidig
The Procter & Gamble Co. tel 513-627-2857 fax - 1233
Miami Valley Laboratories laidig (- at -) pg.com (preferred)
P.O. Box 538707 wd_laidig (- at -) pg.com
Cincinnati, OH 45253-8707 laidig (- at -) qtp.ufl.edu
From: DOUGH (- at -) mdli.com
I am sure you'll get lots of replies explaining what SMILES is - what might
be more useful to you is a molecule file converter program called CONSYSTANT,
from Exographics, which can convert to and from about 30 different widely-
used file formats, including SMILES. You can get more info by searching
for Exographics on the Web, or contact
ExoGraphics
144 Pinecliff Lake Dr
West Milford, NJ 07480
(201) 728-0188
76070.726 (- at -) compuserve.com
From: Vernon Walatka (the author of this post)
I generated Smiles code for the following three compounds.
Titanium dioxide
[Ti](=O)=O
Aluminum hydroxide
[Al](O)(O)O
Zinc stearate
CCCCCCCCCCCCCCCCCC(=O)[O-].[Zn+2].CCCCCCCCCCCCCCCCCC(=O)[O-]
Note that the symbol "C" appears 18 times in each of the two stearate
chains above, and that a charged atom must be written in square brackets,
e.g. [O-] and [Zn+2].
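
A carbon count like this is easy to get wrong by hand, so here is a small
Python sketch that checks it. The helper name carbons_per_component is made
up for this example. The sketch splits the string on "." (the SMILES
separator for disconnected parts) and counts unbracketed "C" atoms in each
part, ignoring anything inside square brackets. The stearate string is
written with the charged oxygen in brackets, [O-], as SMILES requires for
charged atoms.

```python
def carbons_per_component(smiles):
    """Count unbracketed aliphatic carbons in each '.'-separated component."""
    counts = []
    for part in smiles.split("."):
        n = 0
        in_bracket = False
        for i, ch in enumerate(part):
            if ch == "[":
                in_bracket = True
            elif ch == "]":
                in_bracket = False
            elif ch == "C" and not in_bracket and part[i + 1:i + 2] != "l":
                # "Cl" is chlorine, not carbon, so it is skipped.
                n += 1
        counts.append(n)
    return counts


# Build the string programmatically so the chain length is explicit:
# 18 carbons per stearate, the last one being the carboxylate carbon.
stearate = "C" * 18 + "(=O)[O-]"
zinc_stearate = ".".join([stearate, "[Zn+2]", stearate])

print(zinc_stearate)
print(carbons_per_component(zinc_stearate))  # [18, 0, 18]
```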
Vernon Walatka, Ph.D.
Allen Research Center
Quantum Chemical Company
11530 Northlake Drive
Cincinnati, OH 45249
Voice (513) 530-4184
FAX (513) 530-4206
e-mail 62812142 (- at -) eln.attmail.com (preferred) or vvw (- at -) dialup.oar.net