[BioC] Good practice for choosing biological groups to include in array analysis
Salvador
salvador at bio.bsu.by
Fri Aug 17 12:46:50 CEST 2012
Dear listers,
Apologies for asking a general statistics question here, but maybe
someone will be willing to help me.
I'm analysing an Illumina dataset that comes from 4 biological groups.
For simplicity let's call them:
Control
Drug A
Drug B - Concentration 1
Drug B - Concentration 2
Each group has 4 biological replicates, and they were hybridised across 2
chips so that each chip had 2 samples from each group. In terms of the
biological questions asked, Drug A is compared to Control, and the two
concentrations of Drug B are compared to Control as well as to each
other. Drug A is never compared to Drug B.
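To make the layout and the comparisons concrete, here is a minimal Python sketch of a group-means design matrix and the four contrasts of interest. The short group labels (DrugA, DrugB_c1, DrugB_c2), the sample ordering and the variable names are assumptions made for illustration only. They do not come from the actual dataset.

```python
# Short labels for the four biological groups (illustrative names)
groups = ["Control", "DrugA", "DrugB_c1", "DrugB_c2"]

# 16 samples: 2 chips x 4 groups x 2 replicates per group on each chip
samples = [(chip, g) for chip in (1, 2) for g in groups for _ in range(2)]

# Group-means design matrix: one indicator column per group
design = [[1 if g == col else 0 for col in groups] for _, g in samples]

# Contrasts over the group coefficients, in the order of `groups`
contrasts = {
    "DrugA - Control":     [-1, 1, 0, 0],
    "DrugB_c1 - Control":  [-1, 0, 1, 0],
    "DrugB_c2 - Control":  [-1, 0, 0, 1],
    "DrugB_c2 - DrugB_c1": [0, 0, -1, 1],
}

for name, weights in contrasts.items():
    print(f"{name:22s} {weights}")
```

The chip number is kept in `samples` so that it could be added to the model as a blocking factor. The sketch itself does not do that.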
As far as I understand, for comparing Drug A to Control I have two options:
1) Extract data for Drug A and Control from the dataset and run a linear
model on those;
2) Run a linear model on samples from all groups and set up contrasts to
compare Drug A to Control.
Naturally, the second option uses more experimental units to estimate
the residual variance, which gives more residual degrees of freedom and
results in more differentially expressed genes being detected between
Drug A and Control.
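The difference in residual degrees of freedom between the two options can be sketched for a single probe in plain Python. This is only an illustration under stated assumptions: the expression values are simulated (not real data), the group means and standard deviation are made up, the chip effect is ignored, and all groups are assumed to share the same variance. It uses an ordinary one-way model, not the moderated statistics that a package such as limma would compute.

```python
import random
from statistics import mean

def pooled_variance(groups):
    """Residual variance of a one-way model: within-group SS / (N - k)."""
    ss = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups.values())
    df = sum(len(g) for g in groups.values()) - len(groups)
    return ss / df, df

def t_stat(groups, a, b, fit_on):
    """t-statistic for mean(a) - mean(b), variance pooled over `fit_on`."""
    s2, df = pooled_variance({k: groups[k] for k in fit_on})
    se = (s2 * (1 / len(groups[a]) + 1 / len(groups[b]))) ** 0.5
    return (mean(groups[a]) - mean(groups[b])) / se, df

random.seed(1)
# Hypothetical log2 expression values for one probe (simulated)
names = ["Control", "DrugA", "DrugB_c1", "DrugB_c2"]
true_means = {"Control": 8.0, "DrugA": 8.5, "DrugB_c1": 8.2, "DrugB_c2": 8.8}
data = {g: [random.gauss(true_means[g], 0.3) for _ in range(4)] for g in names}

# Option 1: fit on Drug A and Control only
t_sub, df_sub = t_stat(data, "DrugA", "Control", fit_on=["Control", "DrugA"])
# Option 2: fit on all four groups, then contrast Drug A with Control
t_all, df_all = t_stat(data, "DrugA", "Control", fit_on=names)

print(f"option 1: t = {t_sub:.2f} on {df_sub} df")
print(f"option 2: t = {t_all:.2f} on {df_all} df")
```

The estimated difference between Drug A and Control is the same in both cases. Only the variance estimate and its degrees of freedom change (6 versus 12 here), and the pooled estimate is only appropriate if the Drug B groups have a similar variance to the other two.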
Now my question is: is there anything wrong (ethically, statistically,
etc.) with the second option?
Many thanks for your help!
Aliaksei.