[R] BH correction with p.adjust
peter dalgaard
pdalgd at gmail.com
Sun Jul 21 18:01:10 CEST 2013
On Jul 21, 2013, at 15:02 , Scott Robinson wrote:
> My understanding was that the vector was ranked, the adjusted p vector was calculated and then the vector is returned to the original order - I came across a stack overflow answer saying this:
>
> http://stackoverflow.com/questions/10323817/r-unexpected-results-from-p-adjust-fdr
>
> Although the code there does not appear to be the same as when I type "p.adjust" into the command line. The order shouldn't matter anyway since my data was ordered by p.
>
> Yesterday I tried a short example of 5 numbers and it seemed to work out though today I tried to do another short example to demonstrate that the order in the p vector you input doesn't matter but didn't quite get a working example this time. Maybe due to a rounding to first significant figure or something?
>
>> smallP <- c(0.01, 0.5, 0.0001)
>> names(smallP) <- c("first", "second", "last")
>>
>> p.adjust(smallP)
> first second last
> 2e-02 5e-01 3e-04
>>
>> 0.01*3/2
> [1] 0.015
>> 0.5*3/3
> [1] 0.5
>> 0.0001*3/1
> [1] 3e-04
Comparing the same method might help!
> p.adjust(smallP, method="BH")
[1] 0.0150 0.5000 0.0003
>
> In any case I reconstructed a large example which can be run without real data where the figure is way off and definitely not the result of a rounding error:
>
>> exampleP <- seq(from=0.0000001, to=0.1, by=0.00000001)
>> length(exampleP)
> [1] 9999991
>>
>> examplePBH <- p.adjust(exampleP, method="BH")
>>
>> exampleP[1]
> [1] 1e-07
>>
>> examplePBH[1]
> [1] 0.1
>>
>> exampleP[1]*length(exampleP)/1
> [1] 0.9999991
>
> Any help with this would be very much appreciated. It seems like it ought to be such a simple and commonly used method and yet I am struggling and not sure what to do about it.
You have the source code, how about reading it?
}, BH = {
i <- lp:1L
o <- order(p, decreasing = TRUE)
ro <- order(o)
pmin(1, cummin(n/i * p[o]))[ro]
}
Notice the cumulative minimum. The first element in n/i*p[o] is going to be p[n] == 0.1 (since your p is in ascending order). So no element of the result is going to be bigger than 0.1. (I presume this is because p-adjustments must be order-preserving.)
> Thanks,
>
> Scott
>
> ________________________________________
> From: David Winsemius [dwinsemius at comcast.net]
> Sent: 21 July 2013 03:33
> To: Scott Robinson
> Cc: r-help at r-project.org
> Subject: Re: [R] BH correction with p.adjust
>
> On Jul 20, 2013, at 10:37 AM, Scott Robinson wrote:
>
>> Dear List,
>>
>> I have been trying to use p.adjust() to do BH multiple test correction and have gotten some unexpected results. I thought that the equation for this was:
>>
>> pBH = p*n/i
>
> Looking at the code for `p.adjust`, you see that the method is picked from a switch function
>
> lp <- length(p)
> BH = {
> i <- lp:1L
> o <- order(p, decreasing = TRUE)
> ro <- order(o)
> pmin(1, cummin(n/i * p[o]))[ro]
> }
>
> You may not have sorted the p-values in pList.
>
>
>>
>> where p is the original p value, n is the number of tests and i is the rank of the p value. However when I try and recreate the corrected p from my most significant value it does not match up to the one computed by the method p.adjust:
>>
>>> setwd("C:/work/Methylation/IMA/GM/siteLists")
>>>
>>> hypTable <- read.delim("hypernormal vs others.txt")
>>> pList <- hypTable$p
>>> names(pList) <- hypTable$site
>>>
>>> adjusted <- p.adjust(pList, method="BH")
>>> adjusted[1]
>> cg27433479
>> 0.05030589
>>>
>>> pList[1]*nrow(hypTable)/1
>> cg27433479
>> 0.09269194
>>
>
> No data provided, so unable to pursue this further.
>
>> I tried to recreate this is a small example of a vector of 5 p values but everything worked as expected there. I was wondering if there is some subtle difference about how p.adjust operates? Is there something more complicated about how to calculate 'n' or 'i' - perhaps due to identical p values being assigned the same rank or something? Does anyone have an idea what might be going on here?
>
>
> --
>
> David Winsemius
> Alameda, CA, USA
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
More information about the R-help
mailing list