From: "Angelo Favia"
Subject: analysis of qsar data set.
Date: Fri, 3 May 2002 09:08:00 +0200

Dear netters,

I am a final-year student of pharmaceutical chemistry at the Univ. of Bari (Italy). As a project for my degree, I need to analyse a QSAR data set (~130 compounds) in which the pKi data are unevenly distributed. Specifically, the pKi values, which span from 7 to 12, are binned as follows:

7-8: ~5%
8-9: ~16%
9-10: ~57%
10-11: ~26%
11-12: ~2%

Given this distribution, my concern is that the q2 values obtained from the leave-one-out calculation may be unreliable, and also that some validation analyses, such as scrambling of the data, can't help in a situation like this.

I wonder whether I can still be happy performing QSAR on the whole data set (which is what I wish, if possible) or whether it would be better to select a more uniform subset of points.

Have you any clue about that?

Thanks,
Angelo Favia
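
For anyone who wants to try the two checks mentioned above (leave-one-out q2 and scrambling of the activities), here is a minimal sketch. It assumes a plain least-squares linear model and uses synthetic data: the descriptor matrix `X`, the function names `loo_q2` and `scrambled_q2`, and the simulated pKi values are all hypothetical and are not the actual data set. The pKi values are drawn to roughly follow the bins quoted above. Note that q2 is computed here as 1 - PRESS/SS with the overall mean as reference, which is one common convention among several.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out q2 = 1 - PRESS / SS_total for a least-squares linear model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i  # leave compound i out
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2  # squared prediction error
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_total

def scrambled_q2(X, y, n_trials=20, seed=0):
    """q2 values obtained after randomly permuting the activities (y-scrambling)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    return np.array([loo_q2(X, rng.permutation(y)) for _ in range(n_trials)])

# Synthetic stand-in for the data set (assumption, not real data).
rng = np.random.default_rng(1)
weights = np.array([5, 16, 57, 26, 2], dtype=float)
weights /= weights.sum()  # the quoted percentages are approximate, so normalise
bins = rng.choice(5, size=130, p=weights)
y = 7.0 + bins + rng.random(130)  # pKi values between 7 and 12

# Two hypothetical descriptors: one correlated with pKi, one pure noise.
X = np.column_stack([y + rng.normal(0.0, 0.5, 130), rng.normal(size=130)])

q2 = loo_q2(X, y)
q2_scr = scrambled_q2(X, y)
print("LOO q2 on the full set:", round(q2, 3))
print("mean q2 after scrambling:", round(q2_scr.mean(), 3))
```

The same two functions can be run on a more uniform subset of compounds to compare the q2 values with those of the full set. A q2 on the real activities that is not clearly above the scrambled values would be a warning sign.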