From chemistry-request@server.ccl.net Fri May 3 14:42:12 2002
From: "André M. de Oliveira"
To: "Angelo Favia" ,
Subject: Re: CCL:analysis of qsar data set.
Date: Thu, 1 Jan 1998 00:17:58 -0200
Organization: UFOP

Dear Mr. Favia,

Try splitting your dataset into two or more subsets and performing the analysis on each one separately.

Regards.

----- Original Message -----
From: Angelo Favia
To: chemistry@ccl.net
Sent: Friday, May 03, 2002 5:08 AM
Subject: CCL:analysis of qsar data set.

Dear netters,

I am a student in the last year of pharmaceutical chemistry at
the Univ. of Bari (Italy). As a project for my degree, I need to
analyse a QSAR data set (~130 compounds) in which the pKi data are
unevenly distributed. Specifically, the pKi data, which span from 7 to
12, are binned as follows:

7-8: ~5%
8-9: ~16%
9-10: ~57%
10-11: ~26%
11-12: ~2%

Given this distribution, my concern is that there is some risk of
obtaining unreliable q2 values from the leave-one-out calculation,
and also that some validation analyses, such as the scrambling of data,
can't help in a situation like that.

I wonder if I can still be happy performing QSAR on the whole data
set (which is what I would prefer, if possible) or if it would be better
to select a more uniform subset of points.

Do you have any advice on that?

Thanks,
Angelo Favia
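
For readers following the thread, the leave-one-out q2 discussed above can be sketched as below. This is only a minimal illustration, not the method used by either poster. It assumes a single-descriptor linear model and the common definition q2 = 1 - PRESS / SS_total, with SS_total taken about the mean of all observed values. The function names (`fit_line`, `loo_q2`) and the toy descriptor and pKi values are hypothetical and are not taken from the data set in question.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one descriptor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loo_q2(xs, ys):
    """Leave-one-out q2 = 1 - PRESS / SS_total (assumed definition)."""
    y_bar = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical toy data, for illustration only.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
pki = [7.4, 8.1, 8.6, 9.2, 9.5, 9.9, 10.6, 11.3]
print(round(loo_q2(descriptor, pki), 3))
```

Calling `loo_q2` once on the whole set and once on each subset (for example, compounds grouped by pKi range) gives a direct comparison of the two approaches raised in the thread. Note that SS_total shrinks when a subset covers a narrow pKi range, so q2 values from subsets and from the full set are not on the same footing.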