[BioC] removeBatchEffect problem
James W. MacDonald
jmacdon at med.umich.edu
Thu Oct 20 17:32:30 CEST 2011
Hi John,
I don't know what to tell you.
> x
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 21.34228 18.43027 21.29862 15.48104 23.16786 20.83750
[2,] 27.00744 23.49814 27.41335 23.03560 24.70404 22.56948
[3,] 27.02660 24.11282 26.70383 23.81685 22.84880 23.33938
[4,] 21.71527 21.39641 21.07932 20.88411 20.31454 20.50233
[5,] 32.57084 30.33254 31.80184 30.28329 31.11213 30.74922
[6,] 22.51718 22.04644 22.17471 21.84274 20.94986 21.76465
> design <- model.matrix(~factor(c(1,2,1,2,3,4)))
> removeBatchEffect(x, rep(c(1,2,1), each=2), design=design)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 20.59406 17.68205 22.04684 16.22926 23.16786 20.83750
[2,] 26.99328 23.48398 27.42751 23.04976 24.70404 22.56948
[3,] 26.87191 23.95813 26.85852 23.97153 22.84880 23.33938
[4,] 21.42821 21.10935 21.36638 21.17117 20.31454 20.50233
[5,] 32.36628 30.12798 32.00640 30.48785 31.11213 30.74922
[6,] 22.38064 21.90990 22.31125 21.97928 20.94986 21.76465
Works for me with the small subset of data you supply.
Best,
Jim
On 10/20/2011 11:02 AM, John Coulthard wrote:
> Thanks Jim, but still regardless of the design matrix I give it the output is the same?! I must be missing something else.
>
>> design<- model.matrix(~factor(c(1,2,1,2,3,4,3,4,5,6,5,6,7,8,7,8) ))
>> head(removeBatchEffect(e.qn,c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2),design=design)[,1:6])
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] 21.34228 18.43027 21.29862 15.48104 23.16786 20.83750
> [2,] 27.00744 23.49814 27.41335 23.03560 24.70404 22.56948
> [3,] 27.02660 24.11282 26.70383 23.81685 22.84880 23.33938
> [4,] 21.71527 21.39641 21.07932 20.88411 20.31454 20.50233
> [5,] 32.57084 30.33254 31.80184 30.28329 31.11213 30.74922
> [6,] 22.51718 22.04644 22.17471 21.84274 20.94986 21.76465
>> head(removeBatchEffect(e.qn,c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2))[,1:6])
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] 21.34228 18.43027 21.29862 15.48104 23.16786 20.83750
> [2,] 27.00744 23.49814 27.41335 23.03560 24.70404 22.56948
> [3,] 27.02660 24.11282 26.70383 23.81685 22.84880 23.33938
> [4,] 21.71527 21.39641 21.07932 20.88411 20.31454 20.50233
> [5,] 32.57084 30.33254 31.80184 30.28329 31.11213 30.74922
> [6,] 22.51718 22.04644 22.17471 21.84274 20.94986 21.76465
>> design<- model.matrix(~factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2) ))
>> head(removeBatchEffect(e.qn,c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2),design=design)[,1:6])
> [,1] [,2] [,3] [,4] [,5] [,6]
> [1,] 21.34228 18.43027 21.29862 15.48104 23.16786 20.83750
> [2,] 27.00744 23.49814 27.41335 23.03560 24.70404 22.56948
> [3,] 27.02660 24.11282 26.70383 23.81685 22.84880 23.33938
> [4,] 21.71527 21.39641 21.07932 20.88411 20.31454 20.50233
> [5,] 32.57084 30.33254 31.80184 30.28329 31.11213 30.74922
> [6,] 22.51718 22.04644 22.17471 21.84274 20.94986 21.76465
>
>
>
>> Date: Thu, 20 Oct 2011 09:33:01 -0400
>> From: jmacdon at med.umich.edu
>> To: bahhab at hotmail.com
>> CC: bioconductor at r-project.org
>> Subject: Re: [BioC] removeBatchEffect problem
>>
>> Hi John,
>>
>> On 10/20/2011 6:42 AM, John Coulthard wrote:
>>> Hi
>>>
>>> I'm having troubles with the removeBatchEffect function. It seems that whatever I put in as the 'design' I get the same result out of removeBatchEffect(); Perhaps I'm not using the right syntax for the 'design'?
>>>
>>> Below is each line of the function run individually and the output from the line using 'design' with various inputs; Always the same output?!
>>>
>>> My design is 4x2x2; 4 tissues(T) x 2 subgroups(S) x 2 days(D) and the col order in e.qn is...
>>> T1S1D1, T1S2D1, T1S1D2, T1S2D2, T2S1D1, T2S2D1, T2S1D2, T2S2D2, T3S1D1, T3S2D1, T3S1D2, T3S2D2, T4S1D1, T4S2D1, T4S1D2, T4S2D2,
>>>
>>> And it's the day(D) effect I'm trying to remove.
>>>
>>>
>>> Can anyone point me to the right way to run this.
>>>
>>> Many thanks
>>> John
>>>
>>> the command I started with...
>>> e.qn_batch_removed<-removeBatchEffect(e.qn,c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2),design=matrix(c(1,2,1,2,3,4,3,4,5,6,5,6,7,8,7,8),16,1))
>> That isn't the correct design matrix. What you most likely want is
>>
>> design<- model.matrix(~factor(c(1,2,1,2,3,4,3,4,5,6,5,6,7,8,7,8) ))
>>
>> Best,
>>
>> Jim
>>
>>
>>> then line by line...
>>>> removeBatchEffect
>>> function (x, batch, design = matrix(1, ncol(x), 1))
>>> {
>>> x<- as.matrix(x)
>>> batch<- as.factor(batch)
>>> contrasts(batch)<- contr.sum(levels(batch))
>>> X<- model.matrix(~batch)[, -1, drop = FALSE]
>>> X<- qr.resid(qr(design), X)
>>> qrX<- qr(X)
>>> t(qr.resid(qrX, t(x)))
>>> }
>>> <environment: namespace:limma>
>>>
>>>
>>>> x<- as.matrix(e.qn)
>>>> batch<- as.factor(c(1,1,2,2,1,1,2,2,1,1,2,2,1,1,2,2))
>>>> contrasts(batch)<- contr.sum(levels(batch))
>>>> X<- model.matrix(~batch)[, -1, drop = FALSE]
>>>> qr.resid(qr(matrix(c(1,2,1,2,3,4,3,4,5,6,5,6,7,8,7,8),16,1)), X)
>>> batch1
>>> 1 1
>>> 2 1
>>> 3 -1
>>> 4 -1
>>> 5 1
>>> 6 1
>>> 7 -1
>>> 8 -1
>>> 9 1
>>> 10 1
>>> 11 -1
>>> 12 -1
>>> 13 1
>>> 14 1
>>> 15 -1
>>> 16 -1
>>>> qr.resid(qr(matrix(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2),16,1)), X)
>>> batch1
>>> 1 1
>>> 2 1
>>> 3 -1
>>> 4 -1
>>> 5 1
>>> 6 1
>>> 7 -1
>>> 8 -1
>>> 9 1
>>> 10 1
>>> 11 -1
>>> 12 -1
>>> 13 1
>>> 14 1
>>> 15 -1
>>> 16 -1
>>>> qr.resid(qr(matrix(1,16,1)), X)
>>> batch1
>>> 1 1
>>> 2 1
>>> 3 -1
>>> 4 -1
>>> 5 1
>>> 6 1
>>> 7 -1
>>> 8 -1
>>> 9 1
>>> 10 1
>>> 11 -1
>>> 12 -1
>>> 13 1
>>> 14 1
>>> 15 -1
>>> 16 -1
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>> **********************************************************
>> Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
More information about the Bioconductor
mailing list