On Feb 13, 2008 3:06 PM, Gustaf Rydevik <[EMAIL PROTECTED]> wrote:
> On Feb 13, 2008 2:37 PM, Matthias Gondan <[EMAIL PROTECTED]> wrote:
> > Hi Eleni,
> >
> > The problem of this approach is easily explained: Under the Null
> > hypothesis, the P values of a significance test are random variables,
> > uniformly distributed in the interval [0, 1]. It is easily seen that
> > the lowest of these P values is not any 'better' than the highest of
> > the P values.
> >
> > Best wishes,
> >
> > Matthias
>
> Correct me if I'm wrong, but isn't that the point? I assume that the
> hypothesis is that one or more of these genes are true predictors,
> i.e. for these genes the p-value should be significant. For all the
> other genes, the p-value is uniformly distributed. Using a
> significance level of 0.01, and an a priori knowledge that there are
> significant genes, you will end up with on the order of 20 genes, some
> of which are the "true" predictors, and the rest being false
> positives. this set of 20 genes can then be further analysed. A much
> smaller and easier problem to solve, no?
>
> /Gustaf

Sorry, it should say 200 genes instead of 20.

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
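
For anyone who wants to check the arithmetic, here is a rough simulation sketch of the screening argument. It is not from the original analysis. The gene count (20,000) and the number of true predictors (10) are assumptions chosen so that a 0.01 threshold gives roughly 200 hits; the thread states neither number. Null p-values are drawn uniformly on [0, 1], as Matthias describes, and the true predictors are simply given tiny p-values for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

N_GENES = 20_000  # assumed gene count; the thread does not state it
N_TRUE = 10       # assumed number of true predictors (hypothetical)
ALPHA = 0.01      # significance level mentioned in the thread

# Null genes: p-values are uniform on [0, 1] under the null hypothesis.
p_values = [random.random() for _ in range(N_GENES - N_TRUE)]
# True predictors: assigned very small p-values purely for illustration.
p_values += [1e-6] * N_TRUE
is_true = [False] * (N_GENES - N_TRUE) + [True] * N_TRUE

# Screening step: keep every gene whose p-value falls below ALPHA.
selected = [i for i, p in enumerate(p_values) if p < ALPHA]
true_positives = sum(1 for i in selected if is_true[i])
false_positives = len(selected) - true_positives

# Under the null, about ALPHA * (number of null genes) pass by chance.
expected_fp = ALPHA * (N_GENES - N_TRUE)
print(f"expected false positives: {expected_fp:.0f}")
print(f"selected: {len(selected)} "
      f"(true: {true_positives}, false: {false_positives})")
```

Under these assumptions the selected set is dominated by false positives, so it is only a shortlist for further analysis, not a list of confirmed predictors.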