[BioC] Combining two datasets - help to use GeneMeta.

Darlene Goldstein Darlene.Goldstein at epfl.ch
Tue Jun 20 19:51:32 CEST 2006

Robert Gentleman <rgentlem at ...> writes:

> Sean Davis wrote:
> > Sharon wrote:
> >> Hi,
> >>
> >> I am trying to combine two Affy datasets (on rae230a chips) from
> >> experiments done one year apart. In the first dataset, we have 2
> >> strains with each strain treated and untreated.  But for the second
> >> dataset, we have just 2 strains untreated.
> >>
> >> Because of unequal levels in the 2 datasets, I am not able to use
> >> 'getdF'  in GeneMeta as it is.  Any suggestions for using 'getdF' for
> >> this situation?  or any alternate way of combining these 2 datasets?
> > 
> > Are these datasets really that much different that you can't just 
> > combine them?  They may be, but have you looked at affyPLM results, 
> > density plots, etc., just to be sure?  If they aren't that much 
> > different, perhaps you can just normalize them together and move on? 
> > Just asking....
>   Sorry, but that is, IMHO, a bad idea. You should never jointly 
> normalize separate experiments. Normalize separately and use a random 
> effects model for the experiments. As for how to handle different 
> levels of factors/covariates, the issue then becomes one of what can be 
> estimated from both. Once you identify that you can set up the 
> appropriate model and then use tools like nlme and lmer (depending on 
> the model) to estimate parameters. But this will require some 
> statistical expertise and for that you will have to look locally; these 
> things are too hard to do over the internet,  IMHO.
>   There is a BioC technical report on Synthesis of microarray 
> experiments that outlines some of these details more completely.
>   best wishes
>    Robert

Hi, a belated follow-up on Robert's advice. It seems to me that the hope
with joint normalization is to remove the different 'study batch' effects.  I
have posted previously on the apparent futility of this:


I have also posted a preprint of the study on which this advice is based:


The bottom line is that these kinds of study differences always occur, and that
you don't remove them with joint normalization.  You need to normalize within
study and then combine (and there are several suggestions out there for how to
do the combining).
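
As a rough illustration of "normalize within study, then combine", here is a
minimal sketch for a single gene. It assumes the comparison of interest is
strain A versus strain B among untreated samples, since that is what both
datasets can estimate. It computes a standardized mean difference within each
study and combines the two with a DerSimonian-Laird random-effects estimate,
which is one of several possible ways to do the combining. The function names
and the expression values are hypothetical, the small-sample correction is
omitted, and none of this is the GeneMeta API.

```python
import math
from statistics import mean, variance


def standardized_difference(group_a, group_b):
    """Return (d, var_d): pooled-SD standardized mean difference
    between two groups and its usual large-sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    var_d = (n_a + n_b) / (n_a * n_b) + d * d / (2 * (n_a + n_b))
    return d, var_d


def combine_random_effects(effects, variances):
    """DerSimonian-Laird random-effects combination.
    Returns (combined effect, its standard error, between-study tau^2)."""
    w = [1.0 / v for v in variances]
    mu_fixed = sum(wi * d for wi, d in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity of the study effects.
    q = sum(wi * (d - mu_fixed) ** 2 for wi, d in zip(w, effects))
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * d for wi, d in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mu, se, tau2


# Hypothetical log2 expression values for one gene, untreated samples only.
# Each study is assumed to have been normalized on its own; study 2 sits
# about one unit higher overall, standing in for a study batch effect.
study1 = {"strain_a": [7.9, 8.1, 8.3, 8.0], "strain_b": [7.1, 7.4, 7.2, 7.5]}
study2 = {"strain_a": [9.0, 9.3, 9.1], "strain_b": [8.5, 8.4, 8.8]}

per_study = [standardized_difference(s["strain_a"], s["strain_b"])
             for s in (study1, study2)]
effects = [d for d, _ in per_study]
variances = [v for _, v in per_study]

mu, se, tau2 = combine_random_effects(effects, variances)
print(f"per-study effects: {[round(d, 3) for d in effects]}")
print(f"combined effect: {mu:.3f} (SE {se:.3f}), tau^2 = {tau2:.3f}")
```

Because each effect is a difference taken within one study, a constant offset
between studies drops out before anything is combined. In practice this would
be run per gene, and the treated samples that exist only in the first dataset
would need a separate, single-study analysis.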

Best regards, 


Darlene Goldstein
École Polytechnique Fédérale de Lausanne (EPFL)
Institut de mathématiques
Bâtiment MA, Station 8        Tel: +41 21 693 2552
CH-1015 Lausanne              Fax: +41 21 693 4303
