*From*: "Mathias, Paul" <Paul.Mathias.-at-.aspentech.com>
*Subject*: RE: when to take a logarithm of biological activity in QSAR studies?
*Date*: Mon, 5 Jan 2004 15:20:39 -0500

There are other reasons why the logarithm is often used in QSAR models:

1. Typically, the quantity being modeled can only be positive. Using the logarithm ensures this.

2. Many QSAR correlations are derived from theory. For example, correlations for activity coefficients are based upon the excess Gibbs Free Energy; the logarithm of the activity coefficient of a particular component in a mixture is related to its partial molar derivative of the mixture excess Gibbs Free Energy. Models for the excess Gibbs Free Energy (e.g., Hildebrand's Regular-Solution Theory and UNIQUAC) invariably result in a mathematical expression for the logarithm of the activity coefficient as a function of composition, temperature and physical quantities representing the intermolecular interaction.

It's best to have a model for the target property of interest. If the model is good, an accurate QSAR model will be obtained with only a few adjustable parameters. And, for quantities like activities, the resulting QSAR relationship will usually be in the form of a logarithm.

I have a related quote in my files from somebody named D. L. Bunker (I don't know who he is): "Physical Chemistry is research on everything for which the negative logarithm is linear with 1/T."

Paul Mathias

-----Original Message-----
From: Stephen Bowlus [mailto:chezbowlus.-at-.goldrush.com]
Sent: Monday, January 05, 2004 1:16 PM
To: CHEMISTRY.-at-.ccl.net
Subject: CCL:when to take a logarithm of biological activity in QSAR studies?

It certainly makes sense to express the effective concentration as a logarithm, for the reasons given by several respondents to this query. It must be appreciated, however, that there is an ASSUMPTION here that binding affinity is the activity-limiting feature of a given ligand and especially across a series of ligands, if one is extracting a structure-activity relationship. This assumption is frequently violated if one is using whole-organism data as the endpoint for activity.

The distribution characteristics of the logarithm are also statistics-friendly when extracting the model (I assume one is using some sort of regression here).

Use of %-inhibition (at a given concentration) is a common practice noted by J Panek, particularly for whole-organism response. Just as use of the log is good in the above case, there are other transformations of these data which give a (statistically) better distribution, and these should be investigated when developing an SAR equation. The use of arcsin(sqrt(X)) comes to mind.

Historically, I think use of the logarithm sort of (uncritically) grew out of its use in "Hansch-like" QSAR studies, which share theoretical underpinnings with the classic Hammett relationship (and other free-energy relationships). In medicinal chemistry, the use of -logX arose out of a desire that "bigger numbers are better."

sb
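
For readers who want to try the two transformations discussed in this thread, here is a minimal Python sketch. It is only an illustration, not something taken from either message: the function names `p_activity` and `arcsine_sqrt` and the sample values are hypothetical, concentrations are assumed to be in mol/L, and %-inhibition is assumed to have been converted to a fraction between 0 and 1.

```python
import math


def p_activity(conc_molar):
    """Return -log10 of an effective concentration (e.g. an IC50) in mol/L.

    A larger result means a more potent compound ("bigger numbers are better").
    """
    if conc_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(conc_molar)


def arcsine_sqrt(fraction):
    """Arcsine square-root (angular) transform of a proportion in [0, 1]."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    return math.asin(math.sqrt(fraction))


# Hypothetical example data, for illustration only.
ic50_values = [1e-9, 5e-8, 2e-6]   # effective concentrations, mol/L
inhibition = [0.05, 0.50, 0.95]    # fractional inhibition at a fixed dose

pic50 = [p_activity(c) for c in ic50_values]
angular = [arcsine_sqrt(p) for p in inhibition]

for c, p in zip(ic50_values, pic50):
    print(f"IC50 = {c:.1e} M  ->  -log10(IC50) = {p:.2f}")
for f, a in zip(inhibition, angular):
    print(f"inhibition = {f:.2f}  ->  arcsin(sqrt(x)) = {a:.3f} rad")
```

Either transformed value would then serve as the dependent variable in the regression. Whether it actually gives a better-behaved distribution than the raw data is something to check on the data set at hand.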