RE: when to take a logarithm of biological activity in QSAR studies?



There are other reasons why the logarithm is often used in QSAR models:
 1.  Typically, the quantity being modeled can only be positive.  Modeling
 its logarithm ensures that the back-transformed prediction is positive.
 2.  Many QSAR correlations are derived from theory.  For example,
 correlations for activity coefficients are based upon the excess Gibbs Free
 Energy; the logarithm of the activity coefficient of a particular component
 in a mixture is proportional to the partial molar derivative of the mixture
 excess Gibbs Free Energy with respect to that component.  Models for the
 excess Gibbs Free Energy (e.g.,
 Hildebrand's Regular-Solution Theory and UNIQUAC) invariably result in a
 mathematical expression for the logarithm of the activity coefficient as a
 function of composition, temperature and physical quantities representing
 the intermolecular interaction.
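
 For reference, the relationship in item 2 can be written out as below.  This
 is standard solution-thermodynamics notation, not taken from the original
 message: n_i is the mole number of component i, G^E the molar excess Gibbs
 energy, and in the binary regular-solution example V_1 is the liquid molar
 volume of component 1, phi_2 the volume fraction of component 2, and
 delta_1, delta_2 the solubility parameters.

```latex
\ln \gamma_i
  = \left[ \frac{\partial \left( n G^{E} / RT \right)}{\partial n_i}
    \right]_{T,\,P,\,n_{j \neq i}}
\qquad
\text{e.g. (regular solution, binary):}\quad
\ln \gamma_1 = \frac{V_1 \, \phi_2^{2} \, (\delta_1 - \delta_2)^{2}}{RT}
```
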
 It's best to have a model for the target property of interest.  If the model
 is good, an accurate QSAR model will be obtained with only a few adjustable
 parameters.  And, for quantities like activities, the resulting QSAR
 relationship will usually be in the form of a logarithm.
 I have a related quote in my files from somebody named D. L. Bunker (I don't
 know who he is):  "Physical Chemistry is research on everything for which
 the negative logarithm is linear with 1/T."
 Paul Mathias
 -----Original Message-----
 From: Stephen Bowlus [mailto:chezbowlus.-at-.goldrush.com]
 Sent: Monday, January 05, 2004 1:16 PM
 To: CHEMISTRY.-at-.ccl.net
 Subject: CCL:when to take a logarithm of biological activity in QSAR
 studies?
 It certainly makes sense to express the effective concentration as a
 logarithm, for the reasons given by several respondents to this query.
 It must be appreciated, however, that there is an ASSUMPTION here that
 binding affinity is the activity-limiting feature of a given ligand, and
 in particular that this holds across the whole series of ligands when one
 is extracting a structure-activity relationship.  This assumption is frequently
 violated if one is using whole-organism data as the endpoint for
 activity. The distribution characteristics of the logarithm are also
 statistics-friendly when extracting the model (I assume one is using
 some sort of regression here).
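
 As a rough illustration of the "statistics-friendly" point, the sketch below
 compares the skewness of some concentrations before and after taking log10.
 The IC50 values are made up for the example (they are not data from this
 thread), and skewness() is a small helper written here, not a library call.

```python
import math

def skewness(values):
    """Population skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical IC50 values in mol/L, spanning several orders of magnitude.
ic50_molar = [1e-9, 1e-8, 1e-7, 1e-6, 1e-5]
log_ic50 = [math.log10(c) for c in ic50_molar]

raw_skew = skewness(ic50_molar)   # dominated by the largest value
log_skew = skewness(log_ic50)     # evenly spaced on the log scale

print("skewness of raw IC50:  ", raw_skew)
print("skewness of log10 IC50:", log_skew)
```

 On the raw scale the largest concentration dominates the fit; on the log
 scale the points are evenly spread, which is what least-squares regression
 tends to prefer.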
 Use of %-inhibition (at a given concentration) is a common practice
 noted by J Panek, particularly for whole-organism response.  Just as
 use of the log is good in the above case, there are other
 transformations of these data which give a (statistically) better
 distribution, and these should be investigated when developing an SAR
 equation.  The arcsine square-root transformation, arcsin(sqrt(X)) with
 X expressed as a fraction, comes to mind.
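
 A minimal sketch of the arcsine square-root transformation, arcsin(sqrt(p)),
 as it is usually applied to proportions.  The function name and the choice
 to take the input as a percentage are assumptions made for the example.

```python
import math

def arcsine_sqrt(percent_inhibition):
    """Transform a percentage in [0, 100] to arcsin(sqrt(p)), in radians."""
    p = percent_inhibition / 100.0
    if not 0.0 <= p <= 1.0:
        raise ValueError("percent inhibition must lie between 0 and 100")
    return math.asin(math.sqrt(p))

# Hypothetical %-inhibition readings at a single test concentration.
for pct in (5.0, 50.0, 95.0):
    print(pct, arcsine_sqrt(pct))
```

 The transform stretches the scale near 0% and 100%, where the variance of a
 proportion is smallest.  Values outside 0-100% (which do turn up in real
 assays) have to be handled separately before it is applied.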
 Historically, I think use of the logarithm sort of (uncritically) grew
 out of its use in "Hansch-like" QSAR studies, which share theoretical
 underpinnings with the classic Hammett relationship (and other
 free-energy relationships).  In medicinal chemistry, the use of -logX
 arose out of a desire that "bigger numbers are better."
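
 The -logX convention is easy to show in code.  This is only a sketch: the
 function name is made up, and it assumes X is a concentration in mol/L
 (e.g. an IC50), so that the result is what is usually called pIC50.

```python
import math

def neg_log10(concentration_molar):
    """Return -log10 of a concentration given in mol/L (e.g. pIC50)."""
    if concentration_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(concentration_molar)

# Hypothetical compounds: the more potent one (lower IC50) gets the larger value.
print(neg_log10(1e-9))   # 1 nM
print(neg_log10(1e-6))   # 1 uM
```
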
 sb