[BioC] EdgeR: paired samples together with independant samples

Maria Keays mkeays at ebi.ac.uk
Tue Nov 6 10:19:08 CET 2012


Hello,

I read this thread and the related user guide material with interest
because I am working with a very similar data set with paired samples.
However, I'm having trouble, which I think stems from my data being
unbalanced. I have four patients with a disease and three without, and I
have replicates for some patients but not for others. I've created a
design matrix as described on p32 of the 27 October 2012 edgeR user's
guide, but when I try to estimate the common dispersion using
estimateGLMCommonDisp() it tells me:

"Error in glmFit.default(y, design = design, dispersion = dispersion, 
offset = offset) :
   Design matrix not of full rank.  The following coefficients not 
estimable:
  DiseaseHealthy:Patient4"

I guess this is because I have 4 patients in the diseased set and only 3
in the healthy set? If I remove Patient4 and try again, I'm able to
continue the analysis successfully, but I'd obviously like to be able to
include all the data. Is that possible? If so, could you explain how to
do it?
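
A minimal sketch of why that coefficient is not estimable, using only base R and the amended patient labels listed further down. It assumes the design formula is ~Disease + Disease:Patient + Disease:Treatment (my reading of the user's guide section); the names empty and design_reduced are purely illustrative. Because the healthy group has no Patient 4, the matching column of the design matrix contains only zeros, and dropping that column is one possible way to obtain a full-rank matrix while keeping every sample:

```r
# Factors typed in from the amended annotation table
# (patients numbered 1..4 within disease1 and 1..3 within healthy).
Disease   <- factor(rep(c("disease1", "healthy"), times = c(14, 12)))
Patient   <- factor(c(1, 1, 1, 2, 3, 3, 4,   # disease1, control
                      1, 1, 1, 2, 3, 3, 4,   # disease1, treat
                      1, 2, 2, 2, 3, 3,      # healthy, control
                      1, 2, 2, 2, 3, 3))     # healthy, treat
Treatment <- factor(rep(c("control", "treat", "control", "treat"),
                        times = c(7, 7, 6, 6)))

# Assumed formula: patient and treatment effects nested within disease group.
design <- model.matrix(~ Disease + Disease:Patient + Disease:Treatment)

# A column with no non-zero entry corresponds to a disease/patient
# combination that does not occur in the data.
empty <- colSums(design != 0) == 0
print(colnames(design)[empty])

# Compare the rank with the number of columns before and after dropping it.
rank_before    <- qr(design)$rank
design_reduced <- design[, !empty, drop = FALSE]
rank_after     <- qr(design_reduced)$rank
print(c(columns = ncol(design), rank = rank_before))
print(c(columns = ncol(design_reduced), rank = rank_after))
```

The reduced matrix could then be supplied as the design argument to estimateGLMCommonDisp() in place of the original one; that step is not run here.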

The original annotations for my data are below:

Disease     Patient    Treatment
disease1    1          control
disease1    1          control
disease1    1          control
disease1    2          control
disease1    3          control
disease1    3          control
disease1    4          control
disease1    1          treat
disease1    1          treat
disease1    1          treat
disease1    2          treat
disease1    3          treat
disease1    3          treat
disease1    4          treat
healthy     5          control
healthy     6          control
healthy     6          control
healthy     6          control
healthy     7          control
healthy     7          control
healthy     5          treat
healthy     6          treat
healthy     6          treat
healthy     6          treat
healthy     7          treat
healthy     7          treat

Following the user's guide, I renumbered the "Patient" labels within each
disease group, so the annotations looked like this when I created the
design matrix:

Disease     Patient    Treatment
disease1    1          control
disease1    1          control
disease1    1          control
disease1    2          control
disease1    3          control
disease1    3          control
disease1    4          control
disease1    1          treat
disease1    1          treat
disease1    1          treat
disease1    2          treat
disease1    3          treat
disease1    3          treat
disease1    4          treat
healthy     1          control
healthy     2          control
healthy     2          control
healthy     2          control
healthy     3          control
healthy     3          control
healthy     1          treat
healthy     2          treat
healthy     2          treat
healthy     2          treat
healthy     3          treat
healthy     3          treat
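
For reference, the renumbering shown above can also be done programmatically rather than by hand. This is only a sketch in base R: PatientNested is a hypothetical name, and the vectors are typed in from the original annotations.

```r
# Original annotations: patients numbered 1..7 across both groups.
Disease <- rep(c("disease1", "healthy"), times = c(14, 12))
Patient <- c(1, 1, 1, 2, 3, 3, 4,   # disease1, control
             1, 1, 1, 2, 3, 3, 4,   # disease1, treat
             5, 6, 6, 6, 7, 7,      # healthy, control
             5, 6, 6, 6, 7, 7)      # healthy, treat

# Within each disease group, map the distinct patient IDs to 1, 2, 3, ...
PatientNested <- ave(Patient, Disease,
                     FUN = function(p) as.integer(factor(p)))

# Cross-tabulate to check how many samples each relabelled patient has per group.
print(table(Disease, PatientNested))

# Convert to a factor before using it in a design formula.
PatientNested <- factor(PatientNested)
```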

Thanks!
Maria


On 25/10/2012 06:18, Gordon K Smyth wrote:
> Dear Anna,
>
> You are right to recognise that the analysis of this sort of design is 
> more complex than many other experiments, because it includes 
> comparisons both within and between patients.  I have included a new 
> section in the edgeR User's Guide based on your experiment that 
> describes the analysis. This will appear in the official release of 
> edgeR in a couple of days. In the meantime, see pages 31-33 of:
>
>   http://bioinf.wehi.edu.au/software/edgeR/edgeRUsersGuide.pdf
>
> Best wishes
> Gordon
>
>> Date: Tue, 23 Oct 2012 06:37:44 -0700 (PDT)
>> From: "anna [guest]" <guest at bioconductor.org>
>> To: bioconductor at r-project.org, m.nadira at yahoo.fr
>> Subject: [BioC] EdgeR: paired samples together with independant
>>     samples
>>
>>
>> Hello,
>> I am using EdgeR to analyse my RNAseq data.
>>
>> I have:
>>
>> cells from 3 healthy patients, either treated or not with a hormone.
>>
>> cells from 3 patients with disease D1, either treated or not with the 
>> hormone
>>
>> cells from 3 patients with disease D2, either treated or not with the 
>> hormone.
>>
>> I would like to know what is wrong in the response to the hormone in 
>> patients with disease D1 and D2.
>>
>> I don't know how to combine paired comparisons with pairwise 
>> comparisons in a single GLM analysis.
>>
>> thank you very much,
>> anna
>>
>> -- output of sessionInfo():
>>
>> R version 2.15.1 (2012-06-22)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=French_France.1252  LC_CTYPE=French_France.1252
>> [3] LC_MONETARY=French_France.1252 LC_NUMERIC=C
>> [5] LC_TIME=French_France.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods base
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.15.1
>>
>


