CCL: Criteria for deciding whether ligand activities and structures are diverse enough to build a 3D-QSAR model




While one to two log units may be the minimum range, it is difficult to build predictive models when the variance of the dependent variable is low.  The r2 depends on the error of prediction and on the *standard deviation* of the dependent variable in the training set, not on its range.  So it is important not only that the range be adequate, but also that the standard deviation of the dependent variable be large. That is, the activities should be evenly distributed across the range rather than clustered around a single value.
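
To make the point concrete, here is a minimal sketch in Python. It uses the standard definition r2 = 1 - SSE/SST, which for a fixed root-mean-square prediction error (RMSE) becomes r2 = 1 - RMSE^2 / variance. The activity values, the RMSE of 0.4 log units, and the function name are all made up for illustration; they are not from any real data set.

```python
import statistics


def r_squared_from_error(activities, rmse):
    """Return r2 = 1 - SSE/SST = 1 - rmse**2 / (population variance of activities)."""
    variance = statistics.pvariance(activities)
    return 1.0 - rmse ** 2 / variance


# Hypothetical pIC50 values. Both sets span the same range (5.0 to 7.0),
# but the first is clustered near 6.0 and the second is spread evenly.
clustered = [5.0, 5.9, 6.0, 6.0, 6.0, 6.0, 6.1, 7.0]
evenly_spread = [5.0, 5.3, 5.6, 5.9, 6.1, 6.4, 6.7, 7.0]

# Assumed prediction error in log units, identical for both sets.
rmse = 0.4

for name, data in [("clustered", clustered), ("evenly spread", evenly_spread)]:
    sd = statistics.pstdev(data)
    r2 = r_squared_from_error(data, rmse)
    print(f"{name}: range = {max(data) - min(data):.1f}, sd = {sd:.2f}, r2 = {r2:.2f}")
```

With the same range and the same prediction error, the clustered set gives an r2 of roughly 0.37 and the evenly spread set roughly 0.61, purely because of the difference in standard deviation.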

Matthew Clark






Matthew Clark, Ph. D.
CIO Pharmatrope Ltd
610 772 4652
mclark[-]pharmatrope.com