[BioC] Coefficients not estimable

Gordon K Smyth smyth at wehi.EDU.AU
Wed Feb 8 23:13:27 CET 2012


Dear Somnath,

You are asking questions about variances, rather than questions about 
means; and questions about differential variability rather than questions 
about differential expression.  You cannot easily answer these questions 
using limma, or I think any other Bioconductor package.  Have a look at 
the following paper for the sort of methods that are relevant for 
analysing variability:

   http://www.ncbi.nlm.nih.gov/pubmed/21655321

These are statistically advanced methods, and you will need to find a very 
experienced statistician that you can work with.

Best wishes
Gordon

---------------------------------------------
Professor Gordon K Smyth,
Bioinformatics Division,
Walter and Eliza Hall Institute of Medical Research,
1G Royal Parade, Parkville, Vic 3052, Australia.
Tel: (03) 9345 2326, Fax (03) 9347 0852,
smyth at wehi.edu.au
http://www.wehi.edu.au
http://www.statsci.org/smyth

On Wed, 8 Feb 2012, somnath bandyopadhyay wrote:

>
> Hi Gordon,
>
> I tried model.matrix(~gender+gender:donor+day). I am still running into the same error.
>
> In terms of what I want to get out of this analysis is as follows:
>
> 1. Does gene expression vary significantly across donors on the same 
> day? Is the variation higher in males than in females? What are the 
> least varying genes across donors?

> 2. Does gene expression vary significantly across days for a given 
> donor? Is the variation higher in males than in females? What are the 
> least varying genes across days?

> 3. What are the least varying genes across donors, days and gender all 
> taken together? (more like house keeping genes)
>
> Any help would be greatly appreciated.
>
> What I tried:
>
> R version 2.10.1 (2009-12-14)
> Copyright (C) 2009 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>  Natural language support but running in an English locale
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>> library("limma")
>> rna <- read.table('qced_non_redundant_rma_data.txt', header=T, row.names=1, sep= "\t")
>> annotation <- read.table('Test_Samples_Annotation.txt', header=T, row.names=1, sep= "\t")
>> dim(rna)
> [1] 54675    36
>> ## check that the annotations and data match
>> if (sum(sort(rownames(annotation)) == sort(colnames(rna))) != length(rownames(annotation))) {
> +   stop ("ERROR: annotation rownames and RMA colnames do not match")
> + }
>> donor <- factor(annotation$donor)
>> day <- factor(annotation$day)
>> gender <- factor(annotation$gender)
>> design <- model.matrix(~ donor + gender:donor + day, data=annotation)
>> fit <- lmFit(rna,design)
> Coefficients not estimable: donorF2:gendermale donorF3:gendermale donorF5:gendermale donorF7:gendermale donorF8:gendermale donorM1:gendermale donorM4:gendermale donorM6:gendermale
> Warning message:
> Partial NA coefficients for 54675 probe(s)
>> fit2 <- eBayes(fit)
> Warning message:
> In ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim) :
>  Estimation of var.prior failed - set to default value
>> write.fit(fit,file="effects.txt",digits=30, adjust="BH", method="separate",sep="\t")
> Warning message:
> In ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim) :
>  Estimation of var.prior failed - set to default value
>> sessionInfo()
> R version 2.10.1 (2009-12-14)
> x86_64-unknown-linux-gnu
> locale:
> [1] LC_CTYPE=en_US.iso885915       LC_NUMERIC=C
> [3] LC_TIME=en_US.iso885915        LC_COLLATE=en_US.iso885915
> [5] LC_MONETARY=C                  LC_MESSAGES=en_US.iso885915
> [7] LC_PAPER=en_US.iso885915       LC_NAME=C
> [9] LC_ADDRESS=C                   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> other attached packages:
> [1] limma_3.2.3
>>
>
>
>
>> Date: Wed, 8 Feb 2012 14:23:44 +1100
>> From: smyth at wehi.EDU.AU
>> To: genome1976 at hotmail.com
>> CC: naomi at stat.psu.edu; bioconductor at r-project.org
>> Subject: Re: [BioC] Coefficients not estimable
>>
>> Or perhaps as a nested fixed model
>>
>> model.matrix(~gender+gender:donor+day)
>>
>> Whether to fit donor as a fixed term or as a block really depends on what
>> questions you want to answer and, in particular, what you want to
>> interpret from a donor "main effect".
>>
>> Best wishes
>> Gordon
>>
>> ---------------------------------------------
>> Professor Gordon K Smyth,
>> Bioinformatics Division,
>> Walter and Eliza Hall Institute of Medical Research,
>> 1G Royal Parade, Parkville, Vic 3052, Australia.
>> smyth at wehi.edu.au
>> http://www.wehi.edu.au
>> http://www.statsci.org/smyth
>>
>> On Tue, 7 Feb 2012, Naomi Altman wrote:
>>
>>> The problem is that donor is nested within gender. The only way I am aware
>>> of to handle this in limma is to use donor as a block.
>>>
>>> Regards,
>>> Naomi
>>>
>>>
>>> At 11:24 AM 2/7/2012, somnath bandyopadhyay wrote:
>>>
>>>
>>>> Dear Gordon and limma users,
>>>>
>>>> I am trying to fit a linear model to assess variability in healthy donors
>>>> over days for both genders using Affymetrix microarray data. I am
>>>> interested in assessing variability arising from geneder, day of blood
>>>> donated and individual donors. When I tried to do a linear fit in using
>>>> limma, I am not able to assess the coefficients for gender.System is
>>>> computationally singular for the gender main effect. Is there any way to
>>>> assess the variability arising from all three main effects? Any help would
>>>> be greatly appreciated.
>>>>
>>>> Following is my target file:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Array_NUID
>>>> gender
>>>> day
>>>> donor
>>>>
>>>> NUID.0000.0133.2260
>>>> male
>>>> d1
>>>> M1
>>>>
>>>> NUID.0000.0133.2261
>>>> male
>>>> d2
>>>> M1
>>>>
>>>> NUID.0000.0133.2262
>>>> male
>>>> d8
>>>> M1
>>>>
>>>> NUID.0000.0133.2230
>>>> female
>>>> d1
>>>> F2
>>>>
>>>> NUID.0000.0133.6809
>>>> female
>>>> d2
>>>> F2
>>>>
>>>> NUID.0000.0133.2251
>>>> female
>>>> d8
>>>> F2
>>>>
>>>> NUID.0000.0133.2228
>>>> female
>>>> d15
>>>> F2
>>>>
>>>> NUID.0000.0133.2229
>>>> female
>>>> d28
>>>> F2
>>>>
>>>> NUID.0000.0133.2231
>>>> female
>>>> d1
>>>> F3
>>>>
>>>> NUID.0000.0133.2232
>>>> female
>>>> d2
>>>> F3
>>>>
>>>> NUID.0000.0133.2233
>>>> female
>>>> d8
>>>> F3
>>>>
>>>> NUID.0000.0133.2234
>>>> female
>>>> d15
>>>> F3
>>>>
>>>> NUID.0000.0133.2265
>>>> male
>>>> d1
>>>> M4
>>>>
>>>> NUID.0000.0133.2267
>>>> male
>>>> d8
>>>> M4
>>>>
>>>> NUID.0000.0133.2268
>>>> male
>>>> d15
>>>> M4
>>>>
>>>> NUID.0000.0133.2269
>>>> male
>>>> d28
>>>> M4
>>>>
>>>> NUID.0000.0133.2236
>>>> female
>>>> d1
>>>> F5
>>>>
>>>> NUID.0000.0133.2237
>>>> female
>>>> d2
>>>> F5
>>>>
>>>> NUID.0000.0133.2238
>>>> female
>>>> d8
>>>> F5
>>>>
>>>> NUID.0000.0133.6810
>>>> female
>>>> d15
>>>> F5
>>>>
>>>> NUID.0000.0133.2240
>>>> female
>>>> d28
>>>> F5
>>>>
>>>> NUID.0000.0133.2270
>>>> male
>>>> d1
>>>> M6
>>>>
>>>> NUID.0000.0133.2271
>>>> male
>>>> d2
>>>> M6
>>>>
>>>> NUID.0000.0133.2272
>>>> male
>>>> d8
>>>> M6
>>>>
>>>> NUID.0000.0133.2273
>>>> male
>>>> d15
>>>> M6
>>>>
>>>> NUID.0000.0133.2274
>>>> male
>>>> d28
>>>> M6
>>>>
>>>> NUID.0000.0133.2241
>>>> female
>>>> d1
>>>> F7
>>>>
>>>> NUID.0000.0133.2242
>>>> female
>>>> d2
>>>> F7
>>>>
>>>> NUID.0000.0133.2243
>>>> female
>>>> d8
>>>> F7
>>>>
>>>> NUID.0000.0133.2244
>>>> female
>>>> d15
>>>> F7
>>>>
>>>> NUID.0000.0133.2245
>>>> female
>>>> d28
>>>> F7
>>>>
>>>> NUID.0000.0133.2246
>>>> female
>>>> d1
>>>> F8
>>>>
>>>> NUID.0000.0133.2247
>>>> female
>>>> d2
>>>> F8
>>>>
>>>> NUID.0000.0133.2248
>>>> female
>>>> d8
>>>> F8
>>>>
>>>> NUID.0000.0133.2249
>>>> female
>>>> d15
>>>> F8
>>>>
>>>> NUID.0000.0133.2250
>>>> female
>>>> d28
>>>> F8
>>>>
>>>>
>>>> Following is the code I am using:
>>>> library("limma")
>>>> rna <- read.table('qced_non_redundant_rma_data.txt', header=T, row.names=1,
>>>> sep= "\t")
>>>> annotation <- read.table('Test_Samples_Annotation.txt', header=T,
>>>> row.names=1, sep= "\t")
>>>> ## rearrange the annotations to conform to the RMA data
>>>> annotation <- annotation[colnames(rna),]
>>>> ## check that the annotations and data match
>>>> if (sum(sort(rownames(annotation)) == sort(colnames(rna))) !=
>>>> length(rownames(annotation))) {
>>>> stop ("ERROR: annotation rownames and RMA colnames do not match")
>>>> }
>>>> donor <- factor(annotation$donor)
>>>> day <- factor(annotation$day)
>>>> gender <- factor(annotation$gender)
>>>> design <- model.matrix(~ donor + gender + day, data=annotation)
>>>> fit <- lmFit(rna,design)
>>>> fit2 <- eBayes(fit)
>>>> write.fit(fit,file="effects.txt",digits=30, adjust="BH",
>>>> method="separate",sep="\t")
>>>>
>>>>
>>>>
>>>> Following is my sessionInfo:
>>>>> sessionInfo()
>>>> R version 2.10.1 (2009-12-14)
>>>> x86_64-unknown-linux-gnu
>>>> locale:
>>>> [1] LC_CTYPE=en_US.iso885915 LC_NUMERIC=C
>>>> [3] LC_TIME=en_US.iso885915 LC_COLLATE=en_US.iso885915
>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.iso885915
>>>> [7] LC_PAPER=en_US.iso885915 LC_NAME=C
>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>>> [11] LC_MEASUREMENT=en_US.iso885915 LC_IDENTIFICATION=C
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods base
>>>> other attached packages:
>>>> [1] limma_3.2.3
>>>> loaded via a namespace (and not attached):
>>>> [1] tools_2.10.1
>>>> [[alternative HTML version deleted]]
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>>
>>>
>>
>> ______________________________________________________________________
>> The information in this email is confidential and intended solely for the addressee.
>> You must not disclose, forward, print or use it without the permission of the sender.
>> ______________________________________________________________________
>

______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}



More information about the Bioconductor mailing list