[BioC] Combining HGU133A & HGU133B data

Laurent Gautier laurent at cbs.dtu.dk
Mon Sep 15 15:43:17 MEST 2003


On Mon, Sep 15, 2003 at 02:19:01PM +0200, w.huber at dkfz-heidelberg.de wrote:
> 
> Hi
> 
> On Mon, 15 Sep 2003, Laurent Gautier wrote:
> > Wolfgang Huber and Robert Gentleman certainly have a word to say about
> > that. Did you check the function 'combine' in the package 'matchprobes'
> > (section 'devel')?
> 
> The combine function in the matchprobes package is useful for combining
> data from different chip types. The combination is done on the
> probe-level, before normalization, and it requires that there is an
> appreciable overlap in probe sequences (as, for example, with
> hu6800/hgu95av2 or mgu74a/mgu74av2). The combination is based on the
> INTERSECTION of probes that have the same sequence, and from the point of
> view of the expression matrix, it corresponds, loosely speaking, to a
> CBIND.
> 
> What Adaikalavan is looking for is much simpler: something that works on
> the UNION of all probes/genes on HGU133A and HGU133B, and from the point
> of view of the expression matrix corresponds to an RBIND.
> 
> I am not aware of a simpler method for doing this than calling
> new("exprSet", ....) with the arguments patched together from the
> two individual HGU133A and HGU133B exprSets.
> 
> Best regards
>   Wolfgang
> 
> -------------------------------------
> Wolfgang Huber
> Division of Molecular Genome Analysis
> German Cancer Research Center
> Heidelberg, Germany
> Phone: +49 6221 424709
> Fax:   +49 6221 42524709
> Http:  www.dkfz.de/mga/whuber
> -------------------------------------
> 


Ooops... sorry for the confusion (I have never used 'combine' (...yet)).

In this case, the union of expression values is a straightforward 'rbind',
as Wolfgang suggests. The probe business is slightly more tricky because
of the cdfenvs. The following scheme should do it, more or less (I did
not test it):

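## abatch.a and abatch.b are the AffyBatch objects
## (rbind needs the same samples, in the same column order, on both chips)

abatch.ab <- new("AffyBatch", exprs=rbind(exprs(abatch.a), exprs(abatch.b)), cdfName="cdfenv.ab")


## make a cdfenv for the union-combined-chips
cdfenv.ab <- new.env(hash=TRUE)

## copy the probe set indices of chip A unchanged
cdfenv.a <- getCdfInfo(abatch.a)
for (i in ls(cdfenv.a)) {
  assign(i, get(i, envir=cdfenv.a), envir=cdfenv.ab)
}
## the rows of chip B come after those of chip A, so shift its indices
offset <- nrow(exprs(abatch.a))
cdfenv.b <- getCdfInfo(abatch.b)
for (i in ls(cdfenv.b)) {
  ## inherits=FALSE: look in cdfenv.a only, not in its enclosing environments
  if (exists(i, envir=cdfenv.a, inherits=FALSE))
    stop(paste(i, ": id already in use !"))
  assign(i, get(i, envir=cdfenv.b)+offset, envir=cdfenv.ab)
}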


## from now on, this should behave like a regular AffyBatch
## (expect quirks with some methods/functions
## dealing with spatial features of the probes, e.g. image)
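
A toy version of the same index bookkeeping, in base R only, may help to
check the idea without any chips at hand. Everything in it is made up for
illustration: the matrices exprs.a and exprs.b, the environments cdf.a and
cdf.b, and the ids setA1, setA2 and setB1 are hypothetical stand-ins, and
each probe set is stored as a plain vector of row indices, which is a
simplification of what a real cdfenv holds.

```r
## Toy stand-ins: 2 samples; "chip A" has 4 probe cells, "chip B" has 3.
exprs.a <- matrix(1:8, nrow = 4, dimnames = list(NULL, c("s1", "s2")))
exprs.b <- matrix(101:106, nrow = 3, dimnames = list(NULL, c("s1", "s2")))

## Union of the probes: stack the rows; the samples (columns) must match.
stopifnot(identical(colnames(exprs.a), colnames(exprs.b)))
exprs.ab <- rbind(exprs.a, exprs.b)

## Toy cdf environments: probe set id -> row indices into the matrix.
cdf.a <- new.env(hash = TRUE)
assign("setA1", c(1, 2), envir = cdf.a)
assign("setA2", c(3, 4), envir = cdf.a)
cdf.b <- new.env(hash = TRUE)
assign("setB1", c(1, 3), envir = cdf.b)

## Merge: copy A as is, then shift the indices of B past the rows of A.
cdf.ab <- new.env(hash = TRUE)
for (id in ls(cdf.a)) {
  assign(id, get(id, envir = cdf.a), envir = cdf.ab)
}
offset <- nrow(exprs.a)
for (id in ls(cdf.b)) {
  if (exists(id, envir = cdf.ab, inherits = FALSE))
    stop(id, ": id already in use!")
  assign(id, get(id, envir = cdf.b) + offset, envir = cdf.ab)
}

## The probe set of chip B now points at its own rows in the combined matrix.
idx <- get("setB1", envir = cdf.ab)
exprs.ab[idx, , drop = FALSE]
```

The last line should return the same values as exprs.b[c(1, 3), ], which
is the property the offset is meant to guarantee.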



Hopin' it helps,


L.


-- 
--------------------------------------------------------------
Laurent Gautier			CBS, Building 208, DTU
PhD. Student			DK-2800 Lyngby,Denmark	
tel: +45 45 25 24 89		http://www.cbs.dtu.dk/laurent


