
From:  shep \\at// appsdiv.cray.com (Shepard Smithline)
Date:  Tue, 16 May 95 14:04:58 CDT
Subject:  similarity



Dear Netters,

A while back I posted  some questions regarding similarity.  Below
is the original question and the responses I received.  Thanks to
all who responded.

Shep Smithline

__________________________________________________

Original Posting:

Dear Netters,

I have some questions for the similarity gurus out there.

(i)   What are the commonly used (public or commercial) similarity
      programs?

(ii)  What are their major features?

     For example:

     Do they allow a test structure to rotate or translate relative
     to a reference structure, or can the internal geometry change?
     Do they perform any additional analysis once the indexes are computed?

(iii) What sort of data do they use to compute the similarity?

      For example:

      Do they compute indexes based only on volume or shape?
      Do they use charges or other quantum mechanically derived
      data to compute an index?

Please forward responses directly to me. I will summarize to the net.

Thanks,

Shep Smithline

_______________________________________________________________________

Responses:



Dear Shep
I have a method of similarity searching based on "icosahedral matching".
The algorithm is described in reference (1). It was taken up by workers at
Organon, Oss (reference 2), who developed a 3D database searching program
called SPERM. Subsequently and independently I extended the original method
into a series of programs, all of which use the same library of subroutines
but do different things:

THREEDOM 	3D-database searching
COMPARISONS 	Compares a single structure with many structures
CORRELATE	Compares structures in one series with those in another series
		(the two series may be the same)

The feature of icosahedral matching is that it uses the symmetry of the
icosahedral group to increase the efficiency of the matching process
(over brute-force methods) by a factor of 60 (120 if mirror-image searching
is included).

The programs use the ideas of Dean (reference 3) on gnomonic projections to
handle the properties being compared. These properties can be either shape
(e.g. distances from points on an encompassing sphere to the nearest atoms)
or electronic (potentials at points on the encompassing sphere), and in
principle other properties could be used (e.g. hydrophobicity).

The icosahedral matching algorithm takes care of the business of rotating
one of the pair of structures being matched with respect to the other.  It does
not incorporate any translational operation.

Because the icosahedral matching process does not take into account the
overall size of the structures being compared, database searching needs some
prefiltering to avoid comparing grossly dissimilar structures.  Accordingly
there are two other programs: PREFILTERS and QUICKSCAN. The first produces an
index file for the database holding, for every structure, the volume, the
size of the largest axis, and the ellipticities (i.e. ratios of the three
principal axes).  The second extracts just those structures from the database
which meet criteria of similarity to the target structure (+/- percentages of
volume, axis size, ellipticity). These extracted structures are then
submitted to the program THREEDOM.
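
As an illustration of the prefiltering idea above, here is a minimal Python
sketch. It is not the PREFILTERS/QUICKSCAN code: the record layout
(ShapeIndex), the function names and the default tolerance percentages are
all assumptions made for this example, and the ellipticities are assumed to
be stored as two axis ratios.

```python
from dataclasses import dataclass


@dataclass
class ShapeIndex:
    """One index-file record per structure (hypothetical layout)."""
    name: str
    volume: float
    longest_axis: float
    ellipticities: tuple  # assumed: two ratios of the principal axes


def within(value, target, pct):
    """True if value lies within +/- pct percent of target."""
    return abs(value - target) <= abs(target) * pct / 100.0


def quick_scan(target, database, vol_pct=20.0, axis_pct=15.0, ell_pct=15.0):
    """Return the records that pass all three size/shape filters."""
    hits = []
    for entry in database:
        if not within(entry.volume, target.volume, vol_pct):
            continue
        if not within(entry.longest_axis, target.longest_axis, axis_pct):
            continue
        if not all(within(e, t, ell_pct)
                   for e, t in zip(entry.ellipticities, target.ellipticities)):
            continue
        hits.append(entry)
    return hits
```

Only the structures returned by quick_scan would then go on to the more
expensive matching step.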

All of these programs are available from QCPE as part of the INTERCHEM package
for quite modest fees.

References:
(1) P. Bladon, J. Mol. Graphics, 1989, 7, 130-137.
(2) V. J. van Geerestein, N. C. Perry, P. D. J. Grotenhuis, and C. A. G.
    Haasnoot, Tetrahedron Computer Methodology, 1990, 3, 595-613; N. C. Perry
    and V. J. van Geerestein, J. Chem. Inf. Comput. Sci., 1992, 32, 607-616.
(3) P.-L. Chau and P. M. Dean, J. Mol. Graphics, 1987, 5, 97; P. M. Dean and
    P.-L. Chau, J. Mol. Graphics, 1987, 5, 152; P. M. Dean and P. Callow, J.
    Mol. Graphics, 1987, 5, 159; P. M. Dean, P. Callow, and P.-L. Chau, J.
    Mol. Graphics, 1988, 6, 28.

Yours sincerely

Peter Bladon

Phone/Fax +44-(0)141-776-1718
email cbas25 %! at !% vaxa.strath.ac.uk

_________________________________________________________



Shep,

You should be aware of the Oxford Molecular package Asp.  This is an
implementation of the Carbo method of similarity calculation as developed
by Dr. Graham Richards et al. at Oxford University.  The method relies on
the calculation of property overlap integrals and uses either a grid-based
method or a Gaussian approximation.  The latter is significantly faster
and yields indices which are comparable with the grid calculations.  The
'property' is either shape, charge, lipophilicity or any other
user-supplied potential that may be calculated from atom-centred values.
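
For readers unfamiliar with the Carbo index, it is the overlap integral of
the two property distributions divided by the square root of the product of
their self-overlaps. The Python sketch below shows a Gaussian version of
that idea. It is not the Asp implementation: representing each atom as a
(weight, exponent, position) tuple and the function names are assumptions
made for illustration only.

```python
import math


def gaussian_overlap(a, ra, b, rb):
    """Analytic overlap integral of exp(-a|r-ra|^2) and exp(-b|r-rb|^2)."""
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return (math.pi / (a + b)) ** 1.5 * math.exp(-a * b / (a + b) * d2)


def overlap(mol_a, mol_b):
    """Sum of pairwise Gaussian overlaps between two molecules.

    Each molecule is a list of (weight, exponent, (x, y, z)) tuples,
    one atom-centred Gaussian per atom (assumed representation).
    """
    return sum(wa * wb * gaussian_overlap(ea, ra, eb, rb)
               for wa, ea, ra in mol_a
               for wb, eb, rb in mol_b)


def carbo_index(mol_a, mol_b):
    """Carbo index: overlap(A, B) / sqrt(overlap(A, A) * overlap(B, B))."""
    return overlap(mol_a, mol_b) / math.sqrt(
        overlap(mol_a, mol_a) * overlap(mol_b, mol_b))
```

Because every integral is analytic, no grid is needed, which is one
plausible reason a Gaussian approximation is faster than a grid calculation.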

Similarity indices may be calculated for:

1) fixed orientations,
2) rigid rotation/translation,
3) flexible (bonds).

In all cases the 'lead' molecule is fixed and the comparison is optimised.

The software runs on SGI, IBM and HP workstations, and presents the user
with a spreadsheet-based GUI.  Alternatively, Asp may be run via the OM
product Tsar, which is an integrated package (molecular spreadsheet) for
QSAR analysis.

Further details may be obtained from any OM office, or via the OM WWW pages at:

http://www.oxmol.co.uk/

Hope this helps,

Rob Scoffin
===========


Dr. Robert Scoffin       ________________
 Group IT Manager       |      ####      |
&  Product Manager      |  ##   ##   ##  |
Oxford Molecular Ltd    |   ##  ##  ##   |
The Magdalen Centre     |    ## ## ##    |
Oxford Science Park     |     ######     |
Oxford OX4 4GA          |       ##       |
Tel: (0865)784600       |       ##       |
Fax: (0865)784601       |      ####      |
Mobile: (0378) 210813    ----------------


___________________________________________________________________


Dear Shepard,

What kind of similarity are you referring to?
Similarity can be structurally based, such as an RMSD.
Or, similarity can refer to an aggregate representation of a molecule
comprising the number and types of functional groups, presence or absence
of rings, etc.
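
For the first, structure-based kind of similarity, the RMSD between two sets
of matched atoms can be computed as in the Python sketch below. It assumes
the two coordinate lists are already superimposed and list the atoms in the
same order; the function name is made up for this example.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Both arguments are lists of (x, y, z) tuples with atoms in the same
    order; no superposition is done here (assumed to be pre-aligned).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum((x - y) ** 2
                  for p, q in zip(coords_a, coords_b)
                  for x, y in zip(p, q))
    return math.sqrt(squared / len(coords_a))
```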

As I am more familiar with the latter, I would say that Daylight, which
develops and markets the MedChem suite of programs, is indeed the best
for it. Hope this helps...

-mark


******************************************************
*                                                    *
*                                                    *
* Mark A. Zottola                                    *
* markz' at \`dna.chem.duke.edu                            *
* Department of Chemistry                            *
* Duke University                                    *
* Durham, NC 27704                                   *
*                                                    *
*                                                    *
* The fault, dear Brutus, lies not with ourselves,   *
* but rather within our CPUs.                        *
* (with apologies to Shakespeare)                    *
*                                                    *
*                                                    *
******************************************************




________________________________________________________________

Hello,

I read your request for similarity methods and I would like to 'submit' the
following information about a superposition method developed in the group of
Prof. J. Kroon at Utrecht. The method is to be published soon.

-------------------------------------------------------------------------------



* QUASIMODI
* Molecular superposition by means of simulated diffraction patterns.
* Patterson (electron) density superposition.


QUASIMODI is a superposition program that calculates similarity indices from
simulated crystallographic data. Overlap is optimized in Fourier space; the
quantity being overlapped is the Patterson (electron) density, which
automatically includes steric and electronic factors.

For now, the input requires atomic coordinate data for two molecules (.mol,
.res files). Internal rotations are not incorporated, so a fixed conformation
must be supplied. A job file specifies the optimizations to be performed.

Resolution of the Patterson (electron) density description is defined by the
user.

Further use is made of quantum-chemical data which are incorporated in the
program (taken from SHELX).

Automated optimization is performed starting from 12 starting geometries which
are generated by the program. The output contains a list of optimized rotation
and translation parameters for a number of overlays which give rise to maxima in
the similarity parameter space.
Optimized parameters are given with respect to the input geometry. Root Mean
Squared deviations with respect to the input coordinate data (or a user-supplied
molecule) can be determined.
Output of the optimized geometries is possible.

Calculations are rather fast:
twelve optimizations take 5-10 minutes at intermediate resolution
(twelve low-resolution calculations take less than two minutes).
The twelve starting geometries cover the parameter space to be searched well
(in practice the number of optimal overlays, i.e. the number of local maxima
in similarity index parameter space, is less than 12).
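
The multi-start strategy can be pictured with the small Python sketch below.
It is not QUASIMODI's optimizer, which is not described here: it is a generic
hill climb over a single rotation angle, started from 12 evenly spaced
values, and the toy similarity function is invented purely for illustration.

```python
import math


def local_maximize(f, x, step=0.3, tol=1e-6):
    """1-D hill climb: try x +/- step, halve the step when neither improves."""
    fx = f(x)
    while step > tol:
        for cand in (x + step, x - step):
            fc = f(cand)
            if fc > fx:
                x, fx = cand, fc
                break
        else:
            step /= 2.0
    return x, fx


def multistart(f, n_starts=12, period=2.0 * math.pi, merge_tol=1e-3):
    """Optimize from evenly spaced starts and keep the distinct maxima."""
    maxima = []
    for k in range(n_starts):
        x, fx = local_maximize(f, k * period / n_starts)
        x %= period
        # compare angles on the circle so that 0 and 2*pi count as equal
        if not any(min(abs(x - m), period - abs(x - m)) < merge_tol
                   for m, _ in maxima):
            maxima.append((x, fx))
    return sorted(maxima, key=lambda pair: -pair[1])


def toy_similarity(theta):
    """Invented similarity surface with two local maxima (0 and pi)."""
    return math.cos(2.0 * theta) + 0.3 * math.cos(theta)
```

Calling multistart(toy_similarity) runs twelve optimizations but returns
only two distinct maxima, mirroring the observation that the number of
optimal overlays is smaller than the number of starts.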


Submission of the program to QCPE will follow.

---

I hope this is of some use to you.
Kind regards,

Willem Nissink






________________________________________________________________
J.W.M. Nissink                 |
Utrecht University             |    E-Mail:
Department of Analytical       |    W.Nissink ( ( at ) ) ams.chem.ruu.nl
Molecular Spectrometry         |
P.O. Box 80.083                |    S-mail:
3508 TB Utrecht                |    Poortstraat 14 bis
The Netherlands                |    3572 HJ  Utrecht
Tel. +31.30.536817 / 537500    |    Holland
Fax. +31.30.518219             |
________________________________________________________________



Shep,

	Thought I'd take the opportunity to describe some new stuff I've done
(we're shipping it with Spartan V4.0)...

(i)   What are the commonly used (public or commercial) similarity
      programs?
(ii)  What are their major features?

	Spartan now has a similarity/superposition tool - it aligns a series of
molecules against a user-specified template by maximizing the similarity of 3D
grid-based functions surrounding each molecule.  These can be classical
"functions" like volume or electrostatic potential from formal charge or
quantum-based functions like density, electrostatic potential, etc.  In
addition, the user can import functions calculated outside of Spartan.  It's a
simplex-based optimizer, since that seems to be pretty successful in spanning
problem space (and is less dependent on initial guesses for alignment).
 Naturally, the code can be used as a similarity measure as well.  It does not
allow for internal motion (although it can be mated with conformational
analysis to "emulate" flexible fitting).

	We also have a form of Hopfinger's molecular shape analysis - it will
report the volumes, pairwise volumes and volume similarities for a series of
molecules.
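
Volumes and pairwise (shared) volumes of this kind can be estimated by random
sampling. The Python sketch below is a generic Monte Carlo estimate over
unions of spheres and is not the Spartan code: the molecule representation
(a list of (centre, radius) pairs), the sample count and the function names
are assumptions for illustration.

```python
import math
import random


def inside(point, spheres):
    """True if the point lies inside any sphere of the molecule."""
    return any(sum((p - c) ** 2 for p, c in zip(point, centre)) <= r * r
               for centre, r in spheres)


def mc_volumes(mol_a, mol_b, samples=20000, seed=0):
    """Estimate volume of A, volume of B and their shared volume.

    Points are drawn uniformly in a box enclosing both molecules; each
    volume is the box volume times the fraction of points that hit it.
    """
    rng = random.Random(seed)
    spheres = mol_a + mol_b
    lo = [min(c[k] - r for c, r in spheres) for k in range(3)]
    hi = [max(c[k] + r for c, r in spheres) for k in range(3)]
    box = math.prod(h - l for h, l in zip(hi, lo))
    in_a = in_b = in_both = 0
    for _ in range(samples):
        p = [rng.uniform(l, h) for l, h in zip(lo, hi)]
        a, b = inside(p, mol_a), inside(p, mol_b)
        in_a += a
        in_b += b
        in_both += a and b
    scale = box / samples
    return in_a * scale, in_b * scale, in_both * scale
```

The statistical error shrinks as the square root of the number of samples,
so the sample count trades accuracy against run time.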

Hope this helps!

Joe


(iii) What sort of data do they use to compute the similarity?

	The MSA code uses a MC/numerical integration scheme to calculate
volume/shared volume.  The similarity/superposition module uses 3D grid-based
functions - as simple as y/n values for VdW-sphere volume to as complex as the
user can calculate.  It uses Spartan's graphics module to calculate the grids
(which permits difference grids, etc) and can import grids from outside
Spartan.
There is a range of error functions (RMS, correlation, Carbo, etc).
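
To make the error functions concrete, here is a minimal Python sketch of the
three named measures applied to two grids of equal size, each flattened to a
plain list of values. The function names and the flat-list representation
are assumptions for this example and say nothing about how Spartan stores or
compares its grids.

```python
import math


def rms_difference(a, b):
    """Root-mean-square difference between two grids (0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def correlation(a, b):
    """Pearson correlation coefficient between two grids."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def carbo(a, b):
    """Carbo index: sum(a*b) / sqrt(sum(a*a) * sum(b*b))."""
    return sum(x * y for x, y in zip(a, b)) / math.sqrt(
        sum(x * x for x in a) * sum(y * y for y in b))
```

With y/n occupancy grids such as [0, 1, 1, 0], the Carbo index reduces to
the shared cell count divided by the geometric mean of the two cell counts.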



--

------------------------------------------------------------------------
Joe Leonard
Wavefunction Inc.
18401 Von Karman, Suite 370
Irvine, CA  92715                       I am a professional...
714-955-2120                                    do not attempt this at home.
714-955-2118 fax
jle -x- at -x- wavefun.com

_____________________________________________________________


Shep,
	The topic of structural similarity is fairly old and diverse now.
Here is an ever-so-quick lowdown:

1.	2D methods
	The earliest are topological, and the following are some of the
	approaches:
	-	Fragment keys for substructure search (Pfizer, Willett)
	-	Atom-pair enumeration (Lederle)
	-	'Torsional descriptors' (Lederle)
	-	Maximal common subgraph (Willett)
	-	Topological indices (?)

	The best summary is Willett, "Similarity and Clustering in
	Chemical Information systems," RSP, 1987.

2.	3D methods
	-	Grid or field overlay (Richards), w/ or w/o conformational
		flexibility
	-	Projection ('Sperm')
	-	Atom-triplet enumeration (Lederle, CAS)
	-	Atom-triplet enumeration, with fancy indexing (IBM's "FLASH")
	-	Distance matrix comparison (Willett)
	-	3D Maximal Common Subgraph (Willett)
	-	... And lots more than that...
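
As a concrete example of one of the 2D approaches, the Python sketch below
enumerates simplified atom pairs and compares two molecules. It is only an
illustration: atoms are typed by element symbol alone (published atom-pair
descriptors carry more information per atom), and the Tanimoto coefficient
is used as the comparison measure because it is a common choice, not because
any of the programs listed above is known to use it here.

```python
from collections import Counter, deque


def atom_pairs(elements, bonds):
    """Count (type, type, bond-path length) triples for all atom pairs.

    elements: list of element symbols; bonds: list of (i, j) index pairs.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    pairs = Counter()
    for start in range(len(elements)):
        # breadth-first search gives shortest bond-path lengths
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for end, d in dist.items():
            if end > start:  # count each unordered pair once
                a, b = sorted((elements[start], elements[end]))
                pairs[(a, b, d)] += 1
    return pairs


def tanimoto(p, q):
    """Tanimoto coefficient for two multisets of descriptors."""
    shared = sum((p & q).values())  # element-wise minimum
    total = sum((p | q).values())   # element-wise maximum
    return shared / total if total else 1.0
```

For example, the heavy-atom skeletons C-C-O and C-C-C share only one
(C, C, 1) pair out of five distinct descriptor occurrences, giving 0.2.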

Enjoy -- Tom Moock, MDLI

______________________________________________________________________




Dear Dr. Smithline

Here is an answer to your query about 'similarity'.  I hope you will summarize
and post the answers you get.

>i)   What are the commonly used (public or commercial) similarity
>      programs?
---- I don't know how many of the programs themselves are available, but do
you realize how many different methods are available? There are indeed very
many, ranging from topological indices (Randic, Balaban, Kier & Hall and many
others), surface topology and homology groups of algebraic topology (Mezey),
graph theoretical methods (Randic, Bayada), quantum mechanical methods (Carbo,
Richards et al., Cioslowski, Allen & Cooper and others) and surface shape
descriptors (Bywater et al., and various researchers at Scripps: Duncan,
Olson, Getzoff, Max). You will find all the references to these methods in a
forthcoming book:


"Quantitative Measurement of Molecular Similarity using Shape Descriptors" in
R. Carbó (Ed.), "Molecular Similarity and Reactivity: from Quantum
Chemistry to Phenomenological Approaches", Kluwer Verlag, 1994.


As to programs that are available, you should ask Oxford Molecular about the
program ASP. The shape descriptor programs developed by myself and colleagues
will be available one day, but don't expect that to happen too soon; ask me
again in about 6 months from now. In the meantime, you might want to read
these papers:

S. Leicester, J. Finney & R. Bywater J. Mol. Graph. (1988) 6 104-108
S. Leicester, J. Finney & R. Bywater J. Mathematical Chemistry (1994) 16 315-341
S. Leicester, J. Finney & R. Bywater J. Mathematical Chemistry (1994) 16 343-365

Good luck !

Robert Bywater

Novo Nordisk A/S
DK-2880 Bagsvaerd
Denmark


----------- End Forwarded Message -----------




