[BioC] DESeq Run Unequal Sample Size
Simon Anders
anders at embl.de
Wed Jul 23 10:10:57 CEST 2014
Dear Gihanna
On 23/07/14 03:53, Gihanna Galindez wrote:
> Hi Dr. Anders, I would just like to consult about our Illumina run. We have
> two groups of samples from a non-model organism. Group A organisms are
> considerably larger than Group B, which are as small as a dot. As a
> result, one QIAgen RNeasy extraction from Group B requires a larger
> number of samples. For each group of samples we have 4 libraries from 4
> corresponding extractions. Thus, we have a total of 8 libraries. All
> libraries from Group A have n=7. On the other hand, all libraries from
> Group B have n=21. Given the unequal sample size from each library, I would
> like to ask if differential expression analysis between Groups A and B will
> still be valid?
This depends a lot on what you mean by "differentially expressed".
In a somewhat trivial sense, all genes will be expressed much more
strongly in Group A than in Group B. After all, if a Group-A organism is
so much larger, it will contain way more transcript molecules than a
Group-B organism for most if not all genes.
You won't see this in RNA-Seq data, though, because the number of reads
you get out of a library does not depend on the number of mRNA molecules
that went into the library prep, only on the way the flow cell was seeded.
You are probably not interested in seeing this, either. It won't tell
you anything you did not know yet.
What you might be interested in is which transcripts' abundance, as seen in
relation to the other genes in the same cell, depends on the group. The
normalization procedure of DESeq2 aims to choose size factors (i.e.,
scaling factors for normalization) such that most genes or "average"
genes seem to stay unchanged. Hence, you will find genes whose ratio
between these two groups deviates from the overall trend caused by the
size difference. If this is what you want, you are fine.
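To make the idea of size factors concrete, here is a minimal sketch of a median-of-ratios calculation in plain Python. It only illustrates the principle and is not DESeq2's actual code. The function name size_factors and the toy count table are made up for this example. In the toy data, samples 3 and 4 are sequenced twice as deeply as samples 1 and 2, and only the last gene really differs between the groups.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts, one per sample.
    Returns one scaling factor per sample.
    """
    n_samples = len(counts[0])
    # Genes with a zero count have no usable geometric mean; skip them.
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of each gene's geometric mean across samples (a pseudo-reference sample).
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each gene in sample j to the pseudo-reference.
        log_ratios = [math.log(row[j]) - lgm
                      for row, lgm in zip(usable, log_geo_means)]
        # The median follows the bulk of the genes and ignores the few that change.
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical toy data: 5 genes x 4 samples (2 per group).
counts = [
    [10, 10, 20, 20],
    [100, 100, 200, 200],
    [50, 50, 100, 100],
    [30, 30, 60, 60],
    [40, 40, 400, 400],   # the only gene that truly differs between groups
]

sf = size_factors(counts)
normalized = [[c / s for c, s in zip(row, sf)] for row in counts]
```

After dividing by the size factors, the first four genes have the same value in all samples, while the last gene still shows a five-fold difference. That residual difference, relative to the overall trend, is what the test then picks up.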
But make sure to have a close look at the MA plot.
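For reference, the two coordinates of a classic MA plot can be computed as in the sketch below. The function name ma_values and the pseudocount of 1 are assumptions made for this illustration. DESeq2's own plotMA function draws the plot for you and puts the mean of normalized counts on the x-axis instead of the average log2 value used here. If the normalization did what it is meant to do, the bulk of the genes should scatter around M = 0.

```python
import math

def ma_values(mean_a, mean_b, pseudocount=1.0):
    """Return (M, A) for one gene from its mean normalized counts per group.

    M is the log2 ratio of Group B to Group A, A is the average log2 level.
    The pseudocount (an assumption of this sketch) avoids taking log2 of zero.
    """
    log_a = math.log2(mean_a + pseudocount)
    log_b = math.log2(mean_b + pseudocount)
    m = log_b - log_a
    a = 0.5 * (log_a + log_b)
    return m, a

# Hypothetical mean normalized counts for two genes: (Group A, Group B).
unchanged = ma_values(15, 15)   # a gene that follows the overall trend
changed = ma_values(7, 31)      # a gene that is higher in Group B
```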
Simon