RE: when to take a logarithm of biological activity in QSAR studies?
- From: "Mathias, Paul"
- Subject: RE: when to take a logarithm of biological activity in
QSAR studies?
- Date: Mon, 5 Jan 2004 15:20:39 -0500
There are other reasons why the logarithm is often used in QSAR models:
1. Typically, the quantity being modeled can only be positive. Modeling its
logarithm guarantees that the back-transformed prediction is positive.
2. Many QSAR correlations are derived from theory. For example,
correlations for activity coefficients are based upon the excess Gibbs Free
Energy: the logarithm of the activity coefficient of a particular component
in a mixture is proportional (by 1/RT) to that component's partial molar
excess Gibbs Free Energy. Models for the excess Gibbs Free Energy (e.g.,
Hildebrand's Regular-Solution Theory and UNIQUAC) invariably result in a
mathematical expression for the logarithm of the activity coefficient as a
function of composition, temperature and physical quantities representing
the intermolecular interaction.
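
For reference, the relation described in point 2 can be written out as below. This is the standard textbook form rather than anything specific to this thread; the symbols (gamma_i for the activity coefficient, G^E for the molar excess Gibbs energy, n for total moles, v_1 for molar volume, Phi_2 for volume fraction, delta for solubility parameter) are my notation, and the second line is the binary Regular-Solution result given only as one example.

```latex
\ln \gamma_i = \frac{1}{RT}
  \left( \frac{\partial \left( n G^{E} \right)}{\partial n_i} \right)_{T,\,P,\,n_{j \neq i}}
\qquad\text{e.g. (binary Regular-Solution Theory):}\qquad
RT \ln \gamma_1 = v_1 \, \Phi_2^{2} \left( \delta_1 - \delta_2 \right)^{2}
```

In both lines the model delivers ln(gamma) directly, which is why the fitted correlation is naturally expressed in the logarithm.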
It's best to start from a theoretical model for the target property of
interest. If the model is good, an accurate QSAR model will be obtained with
only a few adjustable parameters. And, for quantities like activities, the
resulting QSAR relationship will usually be in the form of a logarithm.
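
A minimal sketch of the two points above (positivity and few parameters), assuming NumPy and made-up data; the function names and the numbers are hypothetical and only for illustration. It fits ln(y) = a + b*x by least squares and compares the extrapolation with a straight-line fit to y itself.

```python
import numpy as np

def fit_log_linear(x, y):
    """Least-squares fit of ln(y) = a + b*x; returns (a, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("y must be strictly positive to take a logarithm")
    b, a = np.polyfit(x, np.log(y), 1)  # polyfit returns highest degree first
    return a, b

def predict_from_log_fit(a, b, x):
    """Back-transform the log-scale prediction; always positive."""
    return np.exp(a + b * np.asarray(x, dtype=float))

# Hypothetical, noise-free data that decays exponentially with a descriptor x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * np.exp(-1.5 * x)

a, b = fit_log_linear(x, y)
log_pred = predict_from_log_fit(a, b, [10.0, 20.0])

# Straight-line fit to the untransformed y, extrapolated to x = 10
slope, intercept = np.polyfit(x, y, 1)
lin_pred = slope * 10.0 + intercept

print("log-scale fit:", log_pred)   # stays positive
print("linear fit at x=10:", lin_pred)  # goes negative for this data
```

With real, noisy data the log fit also weights relative rather than absolute errors, which may or may not be what one wants.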
I have a related quote in my files from somebody named D. L. Bunker (I don't
know who he is): "Physical Chemistry is research on everything for which
the negative logarithm is linear with 1/T."
From: Stephen Bowlus [mailto:chezbowlus.-at-.goldrush.com]
Sent: Monday, January 05, 2004 1:16 PM
Subject: CCL:when to take a logarithm of biological activity in QSAR
It certainly makes sense to express the effective concentration as a
logarithm, for the reasons given by several respondents to this query.
It must be appreciated, however, that there is an ASSUMPTION here that
binding affinity is the activity-limiting feature of a given ligand and
especially across a series of ligands, if one is extracting a
structure-activity relationship. This assumption is frequently
violated if one is using whole-organism data as the endpoint for
activity. The distribution characteristics of the logarithm are also
statistics-friendly when extracting the model (I assume one is using
some sort of regression here).
Use of %-inhibition (at a given concentration) is a common practice
noted by J Panek, particularly for whole-organism response. Just as
use of the log is good in the above case, there are other
transformations of this data which give a (statistically) better
distribution, and these should be investigated when developing an SAR
equation. The use of arcsin(sqrt(X)) comes to mind.
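
As a rough sketch of such a transformation, the following applies the arcsine square-root (angular) transform that is commonly used for proportions, assuming NumPy and %-inhibition values between 0 and 100. The function names are hypothetical, and whether this transform suits a given data set is something to check, not assume.

```python
import numpy as np

def angular_transform(percent_inhibition):
    """Map %-inhibition (0..100) to arcsin(sqrt(p)) in radians, p = percent/100."""
    p = np.asarray(percent_inhibition, dtype=float) / 100.0
    if np.any((p < 0.0) | (p > 1.0)):
        raise ValueError("percent inhibition must lie between 0 and 100")
    return np.arcsin(np.sqrt(p))

def inverse_angular_transform(theta):
    """Map a transformed value (0..pi/2 radians) back to %-inhibition."""
    return 100.0 * np.sin(np.asarray(theta, dtype=float)) ** 2

# Hypothetical measurements at a single test concentration
inhibition = np.array([0.0, 5.0, 50.0, 95.0, 100.0])
theta = angular_transform(inhibition)
recovered = inverse_angular_transform(theta)
```

The transform stretches the scale near 0% and 100%, where raw percentages are compressed, so regression residuals tend to be more evenly spread.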
Historically, I think use of the logarithm sort of (uncritically) grew
out of its use in "Hansch-like" QSAR studies, which share theoretical
underpinnings with the classic Hammett relationship (and other
free-energy relationships). In medicinal chemistry, the use of -logX
arose out of a desire that "bigger numbers are better."