My understanding was that the vector was ranked, the adjusted p vector was calculated and then the vector is returned to the original order - I came across a stack overflow answer saying this:
http://stackoverflow.com/questions/10323817/r-unexpected-results-from-p-adjust-fdr Although the code there does not appear to be the same as when I type "p.adjust" into the command line. The order shouldn't matter anyway since my data was ordered by p. Yesterday I tried a short example of 5 numbers and it seemed to work out though today I tried to do another short example to demonstrate that the order in the p vector you input doesn't matter but didn't quite get a working example this time. Maybe due to a rounding to first significant figure or something? > smallP <- c(0.01, 0.5, 0.0001) > names(smallP) <- c("first", "second", "last") > > p.adjust(smallP) first second last 2e-02 5e-01 3e-04 > > 0.01*3/2 [1] 0.015 > 0.5*3/3 [1] 0.5 > 0.0001*3/1 [1] 3e-04 In any case I reconstructed a large example which can be run without real data where the figure is way off and definitely not the result of a rounding error: > exampleP <- seq(from=0.0000001, to=0.1, by=0.00000001) > length(exampleP) [1] 9999991 > > examplePBH <- p.adjust(exampleP, method="BH") > > exampleP[1] [1] 1e-07 > > examplePBH[1] [1] 0.1 > > exampleP[1]*length(exampleP)/1 [1] 0.9999991 Any help with this would be very much appreciated. It seems like it ought to be such a simple and commonly used method and yet I am struggling and not sure what to do about it. Thanks, Scott ________________________________________ From: David Winsemius [dwinsem...@comcast.net] Sent: 21 July 2013 03:33 To: Scott Robinson Cc: r-help@r-project.org Subject: Re: [R] BH correction with p.adjust On Jul 20, 2013, at 10:37 AM, Scott Robinson wrote: > Dear List, > > I have been trying to use p.adjust() to do BH multiple test correction and > have gotten some unexpected results. I thought that the equation for this was: > > pBH = p*n/i Looking at the code for `p.adjust`, you see that the method is picked from a switch function lp <- length(p) BH = { i <- lp:1L o <- order(p, decreasing = TRUE) ro <- order(o) pmin(1, cummin(n/i * p[o]))[ro] } You may not have sorted the p-values in pList. > > where p is the original p value, n is the number of tests and i is the rank > of the p value. However when I try and recreate the corrected p from my most > significant value it does not match up to the one computed by the method > p.adjust: > >> setwd("C:/work/Methylation/IMA/GM/siteLists") >> >> hypTable <- read.delim("hypernormal vs others.txt") >> pList <- hypTable$p >> names(pList) <- hypTable$site >> >> adjusted <- p.adjust(pList, method="BH") >> adjusted[1] > cg27433479 > 0.05030589 >> >> pList[1]*nrow(hypTable)/1 > cg27433479 > 0.09269194 > No data provided, so unable to pursue this further. > I tried to recreate this is a small example of a vector of 5 p values but > everything worked as expected there. I was wondering if there is some subtle > difference about how p.adjust operates? Is there something more complicated > about how to calculate 'n' or 'i' - perhaps due to identical p values being > assigned the same rank or something? Does anyone have an idea what might be > going on here? -- David Winsemius Alameda, CA, USA ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.