I am a final-year student of pharmaceutical chemistry at
the Univ. of Bari (Italy). As a project for my degree, I need to
analyse a QSAR data set (~130 compounds) in which pKi data are
unevenly distributed. Specifically, pKi data, which span from 7 to
12, are binned as follows
Given this distribution, my concern is that the leave-one-out
calculation may give unreliable q2 values, and also that
validation analyses such as the scrambling of data
can't help in a situation like that.
I wonder whether I can still reasonably perform QSAR on the whole data
set (which is what I would prefer, if possible) or whether it would be
better to select a more uniform subset of points.
Do you have any suggestions about that?
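
To make the two checks mentioned above concrete, here is a minimal sketch of a leave-one-out q2 and of a y-scrambling run. It is only an illustration under strong assumptions: a single descriptor, an ordinary least-squares line, and q2 defined as 1 - PRESS/SS with SS taken about the mean of the full set (other definitions exist). The function names and the toy numbers are hypothetical and are not my actual data.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x for a single descriptor."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def loo_q2(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS (SS about the full-set mean)."""
    n = len(y)
    my = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        press += (y[i] - (a + b * x[i])) ** 2
    ss = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss


def scrambled_q2(x, y, n_trials=100, seed=0):
    """q2 values obtained after randomly permuting the activities."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        y_perm = y[:]          # copy, so the original order is preserved
        rng.shuffle(y_perm)
        results.append(loo_q2(x, y_perm))
    return results


# Toy, made-up numbers: most activities at the low end, a few at the top.
descriptor = [0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 3.8, 4.0]
pki = [7.1, 7.3, 7.2, 7.5, 7.4, 7.6, 7.8, 11.5, 12.0]

print("q2 (real data):", loo_q2(descriptor, pki))
print("max q2 (scrambled):", max(scrambled_q2(descriptor, pki, n_trials=50)))
```

Comparing the q2 of the real data with the spread of q2 values from the scrambled runs is the kind of test I am unsure about when a few compounds at one end of the range dominate the variance.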