[BioC] Error in La.chol(t(A/s)/s) fitting limma models
Gordon Smyth
smyth at wehi.edu.au
Fri Jun 25 04:45:56 CEST 2004
Paul,
There are several things going on here, a couple of which are my fault.
1. The lmFit() function does handle singular design matrices, in the same
way that the lm() function in the base package does. Any coefficients which
are inestimable will simply be set to NA. So my earlier post
https://stat.ethz.ch/pipermail/bioconductor/2004-May/004833.html claiming
that singular matrices produce an error messages from lmFit() was not correct.
2. Prior to BioC release 1.4, the contrasts.fit() function was only
intended to handle orthogonal design matrices arising from one-way layouts
(as described in the help entry). From BioC release 1.4, contrasts.fit() is
intended to handle general design and contrast matrices. But I had
overlooked the need to allow for singular design matrices. Hence the error
message you have got. I will correct this because I would like
contrasts.fit() to handle without errors any MArrayLM object produced by
lmFit().
3. Your design matrix below is not correct. Notice that column 7 is equal
to the sum of columns 3 and 4, hence the singularity. You don't give
reproducible code to compute the design matrix, rather you have produced
the matrix by hand editing, and you don't give pData(eset), so we can't
tell you what the correct code should have been. Since you have 4 factors
in your experiment, it might well be worthwhile to use the formula
representations of factorial designs in R and the resulting
parametrizations. (I know these can be unintuitive for many people.)
4. You might not even need a contrast matrix if the coefficients of
interest to you are already in your linear model.
Gordon
At 11:36 AM 25/06/2004, paul.boutros at utoronto.ca wrote:
>Hello,
>
>I'm having some problems fitting a linear model:
>
>### BEGIN CODE ####################
># libraries
>library(gcrma);
>library(limma);
>
># Normalize via GCRMA
>eset <- justGCRMA();
>
># Save normalized data to disk
>write.exprs(eset, "gcrma.txt");
>
># look at phenotypic data
>pData(eset);
>
># create a dummy design matrix:
>design <- model.matrix(~ -1 + factor(c
>(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,5,5,5,5,5,5,6,6,6,6,6,6,6,6,7,8
>)));
>colnames(design) <- c
>("basal", "LnC", "HW", "LnA", "drug", "time", "resistance",
>"drug_resistance");
>
># then make it real manually
>design <- edit(design)
>
># make a contrast matrix
>contrast.matrix <- makeContrasts(drug, time, resistance, drug_resistance,
>levels=design);
>
># fit the model
>fit1 <- lmFit(eset, design);
>fit2 <- contrasts.fit(fit1, contrast.matrix);
>### END CODE ######################
>
>The error:
>Error in La.chol(t(A/s)/s) : the leading minor of order 4 is not positive
>definite
>
> > design;
> basal LnC HW LnA drug time resistance drug_resistance
>1 1 0 1 0 1 1 1 1
>2 1 0 1 0 1 1 1 1
>3 1 0 1 0 1 1 1 1
>4 1 0 1 0 1 1 1 1
>5 1 0 0 0 1 1 0 0
>6 1 0 0 0 1 1 0 0
>7 1 0 0 0 1 1 0 0
>8 1 0 0 0 1 1 0 0
>9 1 0 0 0 1 0 0 0
>10 1 0 0 0 0 0 0 0
>11 1 0 1 0 0 0 1 0
>12 1 0 1 0 1 0 1 1
>13 1 0 0 0 0 0 0 0
>14 1 0 0 0 0 0 0 0
>15 1 0 0 0 0 0 0 0
>16 1 0 1 0 0 0 1 0
>17 1 0 1 0 0 0 1 0
>18 1 0 0 0 1 0 0 0
>19 1 0 0 0 1 0 0 0
>20 1 0 0 0 1 0 0 0
>21 1 0 1 0 1 0 1 1
>22 1 0 1 0 1 0 1 1
>23 1 0 1 0 1 0 1 1
>24 1 0 1 0 0 0 1 0
>25 1 0 0 1 0 0 1 0
>26 1 0 0 1 0 0 1 0
>27 1 0 0 1 0 0 1 0
>28 1 0 0 1 0 0 1 0
>29 1 0 0 1 1 0 1 1
>30 1 0 0 1 1 0 1 1
>31 1 0 0 1 1 0 1 1
>32 1 1 0 0 0 0 0 0
>33 1 1 0 0 0 0 0 0
>34 1 1 0 0 0 0 0 0
>35 1 1 0 0 0 0 0 0
>36 1 1 0 0 1 0 0 0
>37 1 1 0 0 1 0 0 0
>38 1 1 0 0 1 0 0 0
>39 1 1 0 0 1 0 0 0
>40 1 0 0 1 1 0 1 1
>
> > contrast.matrix;
> drug time resistance drug_resistance
>basal 0 0 0 0
>LnC 0 0 0 0
>HW 0 0 0 0
>LnA 0 0 0 0
>drug 1 0 0 0
>time 0 1 0 0
>resistance 0 0 1 0
>drug_resistance 0 0 0 1
>
>This problem has come up previously:
>https://stat.ethz.ch/pipermail/bioconductor/2004-May/004802.html
>
>And Gordon responded that it was a design-matrix not of full-rank, and
>that an
>error should have been thrown by lmFit.
>https://stat.ethz.ch/pipermail/bioconductor/2004-May/004833.html
>
>So two things:
>1. That error doesn't seem to be thrown in my case: not sure if that's a
>feature or a bug, but it does go against what Gordon indicated the expected
>behaviour would be.
>
>2. When I examine my fitted object:
>
> > fit1;
>An object of class "MArrayLM"
>$coefficients
> basal LnC HW LnA drug time
>resistance drug_resistance
>[1,] 9.610458 -0.09169877 0.15457049 -0.14573919 0.02652003 -
>0.3049501 NA -0.10234624
>[2,] 9.187651 -0.12767969 0.08154156 0.05242885 0.14016087 -
>0.1239801 NA -0.17493971
>[3,] 9.153906 -0.05092235 -0.10669856 0.09775670 0.46374663 -
>0.3449803 NA -0.20873256
>[4,] 11.083109 -0.08410092 -0.04708751 -0.11681116 0.20109553 -
>0.3318239 NA -0.05279189
>[5,] 11.708261 -0.14465592 -0.03052111 -0.25919099 -0.11353194 -
>0.1448190 NA 0.05919881
>15918 more rows ...
>
>And it's apparent that resistance is not being fitted at all, indicating (I
>think) that my matrix isn't of full rank. I'm not catching how I
>mis-specified
>the matrix, though.
>
>The experiment has four factors:
>strain: four levels
>treatment: two levels
>time: two levels
>resistance: two levels
>
>I wasn't sure how to fit the four-level strain, so I fitted three factors as
>separate levels. I also explicitly fitted the drug-resistance interaction
>into
>the design-matrix. I'd guess the latter is the source of my problems, but
>any
>pointers on how I ought to have handled both these issues are very, very much
>appreciated.
>
>Paul
More information about the Bioconductor
mailing list