From: Norman Goldberg <N.Goldberg _-at-_)tu-bs.de>
Date: Fri, 03 Apr 1998 10:06:34 +0200
To: "chemistry $#at#$ www.ccl.net"
Subject: summary: descriptors for the 'shape' and 'size' of molecules

Hi,

some days ago I posted the following question and received an enormous
amount of information, which will keep me busy for a couple of months. :-)
Here's the summary!
Many thanks to all of those who responded (see below). Your information
is highly appreciated.

Greetings,
Norman

------------------------------------------------------
> Subject: CCL:descriptors for the 'shape' and 'size' of molecules

> Hello,
>
> I am looking for a simple (!?) method to describe the 'size' and
> topology of a molecule.
> The aim is to establish a correlation between experimentally observed
> retention times (by liquid chromatography) and the 'shape' of several
> pure hydrocarbons
> (bigger species including 'disk' and umbrella-shaped molecules).
>
> I have no experience in this field of research whatsoever, so any hints
> or comments would be appreciated.

----------------------------------------------
from: "Hank D. Cochran"

The van der Waals volume and surface have been used for this purpose
(by Bondi: "Physical Properties of Molecular Crystals, Liquids, and
Glasses"; Abrams and Prausnitz: AIChE J., 21, 116 (1975)).

------------------------------------------
from: Grunenberg Joerg

There is a simple method published by Grunenberg and Herges.
Please have a look at:
http://www.tu-bs.de/institute/org-chem/herges/grunpic/log_poct/rm_logp.html

------------------------------------------
from Leif Norskov lnl $#at#$ novo.dk

Norman,

maybe the descriptors from MSI's Catalyst/Shape would be useful:
http://www.msi.com/info/products/modules/catSHAPE.html
It basically derives the moments of inertia plus some cross terms (a total
of 18 descriptors, as far as I recall) for each conformer. But it is
commercial (expensive) software. If need be, I could do the calculation
for you (assuming that one can extract the numbers into some ASCII
format; I have actually never used catShape).

-------------------
From Dr. Martin Mueller
Internet: http://www.iuct.fhg.de

If you're looking for a really simple (!?) method: what about topological
indices like connectivity indices or shape indices?
If you have 3D structures, you could calculate maximum and minimum
diameters, and the ratio could be a measure of shape.

--------------------------
from: "Qiang, Cui"

As a matter of fact, the issue is rather similar to some drug design
problems, except that the observable is different. You can certainly
borrow a technique such as GA-NN (genetic algorithm-neural network) to
set up the correlation. Look at a standard book on NN and some papers on
2D/3D drug design (for instance, I recommend several papers by S. So and
M. Karplus published in J. Med. Chem.; for more info, see
http://yuri.harvard.edu/~so).

--------------------------------
from John Gunn (gunnj: at :cerca.umontreal.ca)

Paul Mezey has written an entire book on this:
Shape in Chemistry: An Introduction to Molecular Shape and Topology

-------------------------------
from Gerardo Gonzalez | Dpto. de Quimica | gerardo*- at -*karin.fmq.uh.edu.cu

At the Center for Pharmaceutical Chemistry (La Havana) an M.Sc. thesis
was done precisely on the branch you want to work in, i.e.
prediction of the retention time (in HPLC) of some compounds using
indexes developed from graph theory (including topological and
topographical indexes). If this work is of interest to you, I can contact
the authors to obtain a copy of it. Related work is currently being
carried out at the Bioactive Center of the University of Las Villas under
the guidance of Dr. E. Estrada, referee for various Comp. Chem. journals,
and by Dr. Trinajstic.

-------------
from David A. Winkler
Email: dave.winkler: at :molsci.csiro.au

There are many kinds of molecular descriptors which may do what you want.
They range from the very simple molecular indices such as those of
Randic and Kier & Hall through to molecular holograms, molecular fields,
etc. I'm sure there are literature examples of the kind of work you want
to do. The lipophilicity of the molecules will probably correlate with
the retention time. Some of the simple indices are described on the Web
site of my collaborator, Frank Burden
(http://www.chem.monash.edu.au/Docs/ChemStaffProfiles/Burden.html).
Frank's molecular eigenvalue descriptors (also known as BCUT) may work
well.

--------------------------------
from Carmen Moure n:

There is a professor in my department who has written a couple of books
on molecular connectivity. His name is Lemont Kier, and the topological
descriptors he has proposed are very simple, almost intuitive. His e-mail
is: kier:~at~:gems.vcu.edu. The books are:

Molecular connectivity in chemistry and drug research /
Lemont B. Kier and Lowell H. Hall.
PUBLISHER: New York : Academic Press, 1976.

and

Molecular connectivity in structure-activity analysis /
Lemont B. Kier and Lowell H. Hall.
PUBLISHER: Letchworth, Hertfordshire, England : Research Studies Press ;
New York : Wiley, c1986.

------------
from "Tamas Gunda"

At the Center for Pharmaceutical Chemistry (La Havana) an M.Sc. thesis
was done precisely on the branch you want to work in, i.e.
prediction of the retention time (in HPLC) of some compounds using
indexes developed from graph theory (including topological and
topographical indexes). If this work is of interest to you, I can contact
the authors to obtain a copy of it. Related work is currently being
carried out at the Bioactive Center of the University of Las Villas under
the guidance of Dr. E. Estrada, referee for various Comp. Chem. journals,
and by Dr. Trinajstic. At the CQF you can contact Ramon Carrasco, M.Sc.,
at cqf _-at-_)ceniai.cu or cqf00 _-at-_)infomed.sld.cu

----------------------------
from S. Shapiro toukie %-% at %-% zui.unizh.ch

For your _particular_ purposes I suspect that Kier-Hall molecular
connectivity descriptors should suffice. See Rev. Comput. Chem. 2: 367-422
(1991) and Adv. Drug Res. 22: 1-38 (1992).

------------------
from Gregory L. Durst
email: gdurst&$at$&dowagro.com

I can point you to the program of Kier & Hall called "Molconn", which
calculates topological indexes and would be appropriate for the type of
correlations you describe. Their program is available for Unix and PCs.
Contact is:

Dr. Lowell Hall
Hall Associates Consulting
2 Davis Street
Quincy, MA 02170 USA
617-773-6350 ext 280

A publication/application such as you describe is:
L.B. Kier & L.H. Hall, "J. Pharm. Sci.", v68, (1979), 120.

There is another topological index program called "Polly" by Basak that
runs on Unix or PCs. The contact is:

Dr. Subhash Basak
Center for Water and the Environment
University of Minnesota
5013 Miller Trunk Highway
Duluth, MN 55811 USA
218-720-4279
email: sbasak $#at#$ ua.d.umn.edu

----------------------------
from: dr. ANDREA ZALIANI
E-mail andrea ^%at%^ edith.sublink.org

have a look at this:

G. Bravi, E. Gancia, M. Pegna, P. Mascagni, A. Zaliani
WHIM-MS, new 3D Theoretical descriptors derived from Molecular Surface
Properties: a comparative 3D-QSAR study in a series of steroids
J.Comp.-Aided Mol. Des. 11,79 (1997)

"Nothing shocks me. I am a scientist."
Indiana Jones

--------------------
from: Randy J. Zauhar, PhD zauhar.,at,.fastrans.net

I was talking to my collaborator at U. Missouri yesterday, and was
reminded that he has developed QSAR models to predict retention time
based on molecular properties! Some of those might explicitly include
shape descriptors. His contact info:

Prof. Bill Welsh
Dept. of Chemistry
U. Missouri - Saint Louis
wwelsh %-% at %-% jinx.umsl.edu

You might send him a message and see if he has references or other info
he could provide.

----------------------
from Dr. John Waite, e-mail: chem8:~at~:york.ac.uk

Larry Dodd's molecular volume code may be of use to you. Below I enclose
the comments from it plus a table of atomic covalent radii. E-mail Larry
if you want a copy of the program.

      SUBROUTINE MOLVOL
c     volume.f - volume determination code
c
c     Author:     Lawrence R. Dodd
c                 Doros N. Theodorou
c     Maintainer: Lawrence R. Dodd
c     Created:    March 21, 1990
c     Version:    2.0
c     Date:       1994/07/22 15:45:51
c     Keywords:   volume and area determination
c     Time-stamp: <94/07/22 11:02:23 dodd>
c     Copyright (c) 1990, 1991, 1992, 1993, 1994
c     by Lawrence R. Dodd and Doros N. Theodorou.
C---------------------------------------------------------------------C
C                     Plane Sphere Intersections                      C
C---------------------------------------------------------------------C
C   This program will find the total and individual volume and        C
C   exposed surface area of an arbitrary collection of spheres of     C
C   arbitrary radii cut by an arbitrary collection of planes          C
C   analytically by analyzing the plane/sphere intersections.         C
C---------------------------------------------------------------------C
C   Algorithm by: Doros N. Theodorou and Lawrence R. Dodd             C
C   Coded by:     L.R. Dodd                                           C
C---------------------------------------------------------------------C
C   Created on:           March 21, 1990                              C
C   Phase 1 Completed on: March 23, 1990                              C
C   Phase 2 Completed on: April 16, 1990                              C
C   Phase 3 Completed on: May 17, 1990                                C
C   Phase 4 Completed on: June 5, 1990                                C
C   Phase 5 Completed on: July 26, 1990                               C
C---------------------------------------------------------------------C
C   Reference:                                                        C
C                                                                     C
C   "Analytical treatment of the volume and surface area of           C
C    molecules formed by an arbitrary collection of unequal           C
C    spheres intersected by planes"                                   C
C                                                                     C
C   L.R. Dodd and D.N. Theodorou                                      C
C   MOLECULAR PHYSICS, Volume 72, Number 6, 1313-1345, April 1991     C
C---------------------------------------------------------------------C
C   Acknowledgement:                                                  C
C                                                                     C
C   LRD wishes to thank his mentor DNT for a stimulating and          C
C   enjoyable post-doctoral experience.                               C
C---------------------------------------------------------------------C
C   General Notes On Program:                                         C
C                                                                     C
C   This program has been written with an eye towards both            C
C   efficiency and clarity. On a philosophical note, many believe     C
C   that these ideals are mutually exclusive but in general they      C
C   are not. There are, however, a few instances where one ideal      C
C   has been given more prominence over the other. The comments in    C
C   the program, together with the associated journal article,        C
C   should help to explain any apparent logical leaps in the          C
C   algorithm.                                                        C
C                                                                     C
C   The program was intended to be used as a subroutine called        C
C   repeatedly by some main program. In this case the subroutine      C
C   "VOLUME" is called by some main routine which has placed the      C
C   necessary information in common block /Raw Data/. The answers     C
C   are returned in common block /Volume Output/. I must apologize    C
C   for the poor input/output for the program. For example, the       C
C   area/volume of each sphere is not placed in /Volume Output/.      C
C                                                                     C
C   This program was developed on a Sun SPARCstation 330 using Sun    C
C   FORTRAN 1.3.1 (all trademarks of Sun Microsystems, Inc.). We      C
C   have used some extensions to the ANSI standard including:         C
C                                                                     C
C     o long variable names (i.e., more than six characters)          C
C     o variable names containing the characters '$' and '_'          C
C     o END DO used in place of the CONTINUE statement                C
C     o DO-WHILE used in place of IF-GOTO constructs                  C
C     o excessive number of continuation lines in some FORMATs        C
C     o generic intrinsic function calls (e.g., SIN for DSIN)         C
C     o IMPLICIT NONE statement (needed in development)               C
C                                                                     C
C   The advantage of using non-standard FORTRAN is that it makes it   C
C   considerably easier to follow the flow of a program. There are    C
C   no extraneous statement labels in this program that may have      C
C   obscured the logic (not a single GOTO was used). The previews     C
C   of the new F90 standard appear to adopt many of the features      C
C   already implemented in VMS, Sun, Cray, and IBM FORTRAN.           C
C                                                                     C
C   Note that this algorithm is completely parallelizable.            C
C                                                                     C
C   Larry Dodd                                                        C
C   dodd _-at-_)mycenae.cchem.berkeley.edu                            C
C                                                                     C
C   Department of Chemical Engineering                                C
C   College of Chemistry                                              C
C   University of California at Berkeley                              C
C   Berkeley, California 94720-9989                                   C
C   (415) 643-7691 (LRD)                                              C
C   (415) 643-8523 (DNT)                                              C
C   (415) 642-5927 (Lab)                                              C
C                                                                     C
C   dodd _-at-_)mycenae.cchem.berkeley.edu                            C
C   doros(+ at +)mycenae.cchem.berkeley.edu                           C
C                                                                     C
C---------------------------------------------------------------------C
C   Note:                                                             C
C   Plane_Ordering of common block /Debug/ is, as the name            C
C   implies, for debugging purposes only as is routine ORDERING.      C
C   The information contained therein is not necessary for solving    C
C   the sphere plane problem but proved incredibly useful during      C
C   program development.                                              C
C---------------------------------------------------------------------C

      BLOCK DATA
C
      REAL*8 COVRAD,Au
      COMMON/CRADII/
     .       COVRAD(105)
      DATA (COVRAD(I), I = 1, 88) /
C      H        He       Li       Be
     + 0.320D0, 0.930D0, 1.230D0, 0.900D0,
C      B        C        N        O        F        Ne
     + 0.820D0, 0.770D0, 0.750D0, 0.730D0, 0.720D0, 0.710D0,
C      Na       Mg
     + 1.540D0, 1.360D0,
C      Al       Si       P        S        Cl       Ar
     + 1.180D0, 1.110D0, 1.060D0, 1.020D0, 0.990D0, 0.980D0,
C      K        Ca
     + 2.030D0, 1.740D0,
C      Sc       Ti       V        Cr       Mn
     + 1.440D0, 1.320D0, 1.220D0, 1.180D0, 1.170D0,
C      Fe       Co       Ni       Cu       Zn
     + 1.170D0, 1.160D0, 1.150D0, 1.170D0, 1.250D0,
C      Ga       Ge       As       Se       Br       Kr
     + 1.260D0, 1.220D0, 1.200D0, 1.160D0, 1.140D0, 1.120D0,
C      Rb       Sr
     + 2.160D0, 1.910D0,
C      Y        Zr       Nb       Mo       Tc
     + 1.620D0, 1.450D0, 1.340D0, 1.300D0, 1.270D0,
C      Ru       Rh       Pd       Ag       Cd
     + 1.250D0, 1.250D0, 1.280D0, 1.340D0, 1.480D0,
C      In       Sn       Sb       Te       I        Xe
     + 1.440D0, 1.410D0, 1.400D0, 1.360D0, 1.330D0, 1.310D0,
C      Cs       Ba       La
     + 2.350D0, 1.980D0, 1.690D0,
C      Ce       Pr       Nd       Pm       Sm       Eu       Gd
     + 1.650D0, 1.650D0, 1.640D0, 1.630D0, 1.620D0, 1.850D0, 1.610D0,
C      Tb       Dy       Ho       Er       Tm       Yb       Lu
     + 1.590D0, 1.590D0, 1.580D0, 1.570D0, 1.560D0, 1.560D0, 1.560D0,
C      Hf       Ta       W        Re
     + 1.440D0, 1.340D0, 1.300D0, 1.280D0,
C      Os       Ir       Pt       Au       Hg
     + 1.260D0, 1.270D0, 1.300D0, 1.340D0, 1.490D0,
C      Tl       Pb       Bi       Po       At       Rn
     + 1.480D0, 1.470D0, 1.460D0, 1.460D0, 1.450D0, 0.000D0,
C      Fr       Ra
     + 0.000D0, 0.000D0/
C
      DATA (COVRAD(I), I = 89, 105) /
C CMK92 Using the Lanthanides' values is probably the best approximation
C      Ac
     + 1.690D0,
C      Th       Pa       U        Np       Pu       Am       Cm
     + 1.650D0, 1.650D0, 1.640D0, 1.630D0, 1.620D0, 1.850D0, 1.610D0,
C      Bk       Cf       Es       Fm       Md       No       Lr
     + 1.590D0, 1.590D0, 1.580D0, 1.570D0, 1.560D0, 1.560D0, 1.560D0,
     + 2 * 0.000D0/
C
      END

---------
--
_____________________________________
Dr. Norman Goldberg (N.Goldberg -8 at 8- tu-bs.de)
Technische Universitaet Braunschweig
Institut fuer Organische Chemie
Hagenring 30
D-38106 Braunschweig (FRG)
Tel.: +(0)531-391-5312
Fax : +(0)531-391-5388
http://www.tu-bs.de/institute/org-chem/goldberg/WELCOME.htm
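
---------
Illustration of the simplest descriptor recommended in several replies
above: the first-order connectivity index of Randic (the simple
first-order chi of Kier & Hall). This is only a minimal sketch, not code
from Molconn, Polly, or any respondent. It assumes the molecule is given
as a hydrogen-suppressed graph, i.e. a list of carbon-carbon bonds; the
function name randic_index and the atom numbering are made up for the
example. For each bond the term 1/sqrt(d_i * d_j) is summed, where d is
the number of carbon neighbours of each atom.

```python
from math import sqrt


def randic_index(bonds):
    """First-order connectivity index of a hydrogen-suppressed graph.

    bonds: list of (i, j) pairs, one per C-C bond (labels are arbitrary).
    """
    # Degree of each atom = number of bonds it takes part in.
    degree = {}
    for i, j in bonds:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    # Sum 1/sqrt(d_i * d_j) over all bonds.
    return sum(1.0 / sqrt(degree[i] * degree[j]) for i, j in bonds)


# Two C4H10 isomers as carbon skeletons (hypothetical atom numbering).
n_butane = [(1, 2), (2, 3), (3, 4)]
isobutane = [(1, 2), (2, 3), (2, 4)]

if __name__ == "__main__":
    for name, bonds in (("n-butane", n_butane), ("isobutane", isobutane)):
        print(f"{name}: chi = {randic_index(bonds):.3f}")
```

For these two isomers the index comes out as 1.914 (n-butane) and 1.732
(isobutane): the more branched, more compact skeleton gets the smaller
value. Whether this tracks retention times for disk- or umbrella-shaped
hydrocarbons would have to be checked against the measured data.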