I am trying to find differentially expressed genes between
appendix and colon tumor samples, which have been arrayed
on different platforms, namely hgu133a2 and hgu133plus2.
hgu133a2 is a subset of hgu133plus2, and Bioconductor provides
a package, inSilicoMerging, that's supposed to handle merging
datasets like these, so I thought it would be straightforward.

First I read in my CEL files and normalize them:

> library(limma)   # readTargets()
> library(affy)    # ReadAffy()
> library(gcrma)   # gcrma()
> targets_hsu <- readTargets("Hsu-targets.txt")
> targets_kai <- readTargets("kaiser-targets.txt")
> ab_hsu <- ReadAffy(filenames=targets_hsu$FileName)
> ab_kai <- ReadAffy(filenames=targets_kai$FileName)
> eset_hsu <- gcrma(ab_hsu)
> eset_kai <- gcrma(ab_kai)

So far so good.  Now I merge the esets with inSilicoMerging:

> library(inSilicoMerging)
> eset <- merge(list(eset_hsu,eset_kai),method="COMBAT")
  INSILICOMERGING: Run COMBAT...
  INSILICOMERGING:   => Found 2 batches
  INSILICOMERGING:   => Found 0 covariate(s)
> dim(eset_hsu)
Features  Samples
   22277       26
> dim(eset_kai)
Features  Samples
   54675       10
> dim(eset)
Features  Samples
   22277       36
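
The merged feature count is what I would expect if only the probe sets
shared by both platforms are kept.  I have not checked how inSilicoMerging
does this internally, so the following is just a base-R sketch of that idea
on toy matrices (the names `small` and `large` and the probe IDs are made up,
not my real data):

```r
# Toy stand-ins for the two expression matrices (hypothetical probe IDs and values).
set.seed(1)
small <- matrix(rnorm(6), nrow = 3,
                dimnames = list(c("p1", "p2", "p3"), c("s1", "s2")))
large <- matrix(rnorm(15), nrow = 5,
                dimnames = list(c("p1", "p2", "p3", "p4", "p5"),
                                c("s3", "s4", "s5")))

# Keep only the probe sets present on both platforms, then bind the samples together.
common <- intersect(rownames(small), rownames(large))
merged <- cbind(small[common, , drop = FALSE], large[common, , drop = FALSE])

# Rows = shared probe sets, columns = all samples from both sets.
dim(merged)
```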

This looks like it worked.  I used plotMDS(), and the samples from the
two datasets are nicely intermixed, as one would hope.  Now I need to do
DE analysis with limma.  The Hsu samples (the first 26) are experimental
and the Kai samples (the last 10) are control.  So I create a design
matrix like this:


> design <- cbind(CTL=1, EXPvsCTL=c(rep(1,26),rep(0,10)))
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> tt<-topTable(fit, coef="EXPvsCTL",number=100000)
> head(tt,n=3)
                  logFC  AveExpr             t P.Value adj.P.Val         B
1007_s_at  1.487227e-15 7.045851  4.223110e-15       1         1 -6.235399
1053_at    7.281216e-16 5.498793  1.127582e-14       1         1 -6.235399
117_at    -8.262373e-16 5.047563 -2.068570e-15       1         1 -6.235399
> tail(tt,n=3)
                        logFC  AveExpr             t P.Value adj.P.Val
AFFX-TrpnX-3_at -3.953675e-16 2.257998 -1.507504e-13       1         1
AFFX-TrpnX-5_at  1.024821e-16 2.257637  4.062598e-14       1         1
AFFX-TrpnX-M_at  1.024821e-16 2.257637  4.062598e-14       1         1
                        B
AFFX-TrpnX-3_at -6.235399
AFFX-TrpnX-5_at -6.235399
AFFX-TrpnX-M_at -6.235399


As you can see, the p-values and B statistics are the same for every
probe, and the log fold changes are all essentially zero (on the order
of 1e-15).  Clearly something is wrong here.  Did I do something wrong?
Is this sort of thing expected when you merge datasets like this?  Any
nudges in the right direction would be appreciated.
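
In case it is useful, here is a small self-contained sketch of how the
samples are laid out.  It uses only base R; the factor names `dataset` and
`group` are labels I made up for illustration, and `design2` is just the same
design rebuilt with model.matrix() as a cross-check:

```r
# Sample labels as described above: 26 Hsu (experimental), then 10 Kai (control).
dataset <- factor(c(rep("Hsu", 26), rep("Kai", 10)))
group   <- factor(c(rep("EXP", 26), rep("CTL", 10)), levels = c("CTL", "EXP"))

# The hand-built design from above.
design <- cbind(CTL = 1, EXPvsCTL = c(rep(1, 26), rep(0, 10)))

# The same design built from the factor; only the column names differ.
design2 <- model.matrix(~ group)
all(design2[, "groupEXP"] == design[, "EXPvsCTL"])

# Cross-tabulate dataset of origin against experimental group.
table(dataset, group)
```

In my data the two labels coincide exactly: every experimental sample comes
from the Hsu set and every control sample comes from the Kai set.
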
-Ed
