[BioC] loop design matrix

Gordon Smyth smyth at wehi.edu.au
Fri Jan 13 01:55:19 CET 2006


At 11:00 AM 13/01/2006, Matthew Scholz wrote:
>Gordon,
>
>Thank you for your thoughtful reply. I apologize for not having 
>provided a more thorough accounting of my R session. To save space, 
>I have left out the output of the initial steps (reading in the 
>files, kooperberg correcting them, and quantile normalizing), but 
>would be happy to provide them if you think they would be of use. I 
>have, nonetheless, included the code I inputted to generate the data 
>object RGAn that I pass to the lmFit function.

[snip]

> > designmatrix <- modelMatrix(targetsA, ref="RC")
>Found unique target names:
>  A B C D E RC
> > designmatrix
>                      A  B  C  D  E
>2005-11-18_A130high -1  1  0  0  0
>2005-11-16_A121high  0 -1  1  0  0
>2005-11-15_A117high  0  0 -1  1  0
>2005-11-16_A119high  0  0  0 -1  1
>2005-11-17_A127high  0  0  0  0 -1
>2005-11-17_A128high  1  0  0  0  0
>2005-11-17_A129high -1  0  1  0  0
>2005-11-18_A131high  0 -1  0  1  0
>2005-11-16_A120high  0  0 -1  0  1
>2005-11-15_A118high  0  0  0 -1  0
>2005-11-22_A137high  1  0  0  0 -1
>2005-11-18_A136high  0  1  0  0  0
> > designfitA <- lmFit(RGAn, design=designmatrix, method="ls")
> > designfitA <- eBayes(designfitA)
>
>#And here's a look at a subset of the data object with the warning 
>that I am asking about
>
> > designfitA[2,1]
>An object of class "MArrayLM"
>$coefficients
>               A
>[1,] 0.07587446
>
>$stdev.unscaled
>              A
>[1,] 0.7007649
>
>$sigma
>[1] 2.338142
>
>$df.residual
>[1] 6
>
>$cov.coefficients
>           [,1]
>[1,] 0.4166667
>
>$pivot
>[1] 1 2 3 4 5
>
>$method
>[1] "ls"
>
>$design
>                      A B C D E
>2005-11-18_A130high -1 1 0 0 0
>
>$genes
>   Block Row Column         ID     Name
>2     1   1      2 MZ00040730 TC245883
>
>$Amean
>[1] 6.35853
>
>$df.prior
>[1] 3.102594
>
>$s2.prior
>[1] 0.5266471
>
>$var.prior
>[1] 0.5289733
>
>$proportion
>[1] 0.01
>
>$s2.post
>[1] 3.783035
>
>$t
>               A
>[1,] 0.05566768
>
>$p.value
>              A
>[1,] 0.9568093
>
>$lods
>              A
>[1,] -4.959734
>
>$F
>[1] 0.3347732
>
>$F.p.value
>[1] 0.879682
>
>Warning message:
>design matrix is singular
>
>Thanks again,
>
>Matt

Thanks for providing the detailed session. This clarifies things.

The warning message is not actually contained in the fitted model 
object. If you type

   designfitA

at the R prompt, you won't see any warning message. The warning 
message is generated and written to the screen by the subsetting 
operation. If you typed

   fitsubset <- designfitA[2,1]

you would see the warning while if you type

   fitsubset

you won't. What you're seeing in the output from your session is the 
contents of fitsubset followed by the warning message. The warning 
message may appear to be part of the object just because it follows 
immediately in the output.

The subset designfitA[2,1] means the model fit for just the second 
gene and the first coefficient. (Why do you want this?) The design 
matrix is stored as part of the fitted model object, and the 
subsetting operation changes it as well. The warning message is 
intended to warn you that the subsetted design matrix stored in 
fitsubset should not be used as input for any subsequent linear model 
fits. I agree that the warning is unexpected in this situation, and I 
am going to change it. (In fact I'm going to change the way in which 
subsetting affects the design matrix.)

The thing to remember is that your linear model fit was perfectly 
correct and you should be confident about using it. The warning 
message occured when, later on, an operation was used which changed 
the copy of the design matrix which was stored in the fitted model 
object. The warning doesn't relate to the design matrix used to fit 
your linear model. Rather it refers to a different and smaller design 
matrix produced when you subsetted the fitted model object.

Best wishes
Gordon



More information about the Bioconductor mailing list