CCL: Criteria for selecting if ligand activities and structures are enough
diverse for building a 3D-QSAR model
From: Matthew Clark <mclark(0)pharmatrope.com>
Date: Sun, 10 Jan 2010 21:22:33 -0500
While one to two log units may be the minimum range, it is difficult to make
predictive models when the variance of the dependent variable is low. The r2
depends on both the error of prediction and the *standard deviation* of the
dependent variable in the training set, not on its range. So it is important
not only that the range be adequate, but also that the standard deviation of
the dependent variable be large. That is, the activities should be evenly
distributed across the range rather than clustered around a single value.
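
To illustrate the point, here is a minimal sketch in Python. It uses the usual
definition r2 = 1 - SS_res / SS_tot, which is the same as 1 - MSE / variance of
the observed values. The activity values and prediction errors below are made
up for illustration (hypothetical pIC50 numbers, not data from any real set).
Both sets cover the same two log units and get identical prediction errors;
only the spread of the activities differs.

```python
from statistics import pstdev


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical activities: both sets span the same range, 5.0 to 7.0.
datasets = {
    "clustered": [5.0, 5.9, 6.0, 6.0, 6.0, 6.0, 6.1, 7.0],
    "spread": [5.0, 5.3, 5.6, 5.9, 6.1, 6.4, 6.7, 7.0],
}

# Assumed prediction errors, identical for both sets (RMSE = 0.3 log units).
errors = [0.3, -0.3, 0.3, -0.3, 0.3, -0.3, 0.3, -0.3]

results = {}
for name, observed in datasets.items():
    predicted = [o + e for o, e in zip(observed, errors)]
    results[name] = {
        "range": max(observed) - min(observed),
        "sd": pstdev(observed),
        "r2": r_squared(observed, predicted),
    }
    print(name, results[name])
```

With the same range and the same error of prediction, the evenly spread set
has the larger standard deviation and therefore the higher r2.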