[BioC] read subset of affy data

Sat Oct 18 11:17:00 MEST 2003

On Fri, Oct 17, 2003 at 07:21:10PM -0400, Xuejun Peng wrote:
> I have 100 affy arrays and I got an "out-of-memory" error message when I 
> try to batch-read all of them simultaneously. On the other hand, it is 
                                ^^^^^^^^^^^^^^
> really inefficient to read them one by one.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Unless you have customized your own parallel computing/threading routines,
the CEL files are read in sequence anyway, so reading them in one batch
is no faster than reading them one at a time.

> Since I have a specific list of genes that I want to read, can I use 
> ReadAffy or some other functions to read the subset of genes directly 
> from .cel files? For a simple example, I only want to read  five genes 
> and I know their probe set ID's.

The efficient way is probably to read them one by one (or 10 by 10,
or whatever number your memory lets you handle), and to extract and
store the probe sets of interest each time (function 'probeset').
You can then merge the probe sets (see 'class?ProbeSet').

Example:

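library(affy)  ## provides read.affybatch() and probeset()

## list of CEL files (the backslash must be doubled: "\." is not a valid R escape)
l <- list.files(".", pattern = "\\.cel$")
all.ppsets <- vector("list", length = length(l))

for (i in seq(along = l)) {
  ## read a single CEL file, then keep only the probe sets of interest
  abatch <- read.affybatch(filenames = l[i])
  all.ppsets[[i]] <- probeset(abatch, c("foo123_at", "foo456_at"))
}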
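
For the "10 by 10" variant, the file list can be cut into chunks first.
This is only a rough sketch in base R: the file names and the chunk size
are made up for illustration, and the actual read is left as a comment
so the snippet runs without any CEL files around.

```r
## Hypothetical file names; in practice use the vector returned by list.files()
files <- sprintf("sample%03d.cel", 1:25)
chunk.size <- 10  ## assumption: tune this to what your memory allows

## Assign each file to a chunk (1,1,...,1,2,2,...) and split accordingly
chunks <- split(files, ceiling(seq_along(files) / chunk.size))

for (chunk in chunks) {
  ## With affy loaded, this is where one would call
  ## read.affybatch(filenames = chunk) and then probeset() on the result
  cat("would read", length(chunk), "files, starting with", chunk[1], "\n")
}
```

Each chunk then gives probe sets with one column per array in that
chunk, instead of one column per iteration.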

## now you just have to merge them...
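
One possible way to do the merge, as a sketch rather than a tested
recipe: bind the per-array PM matrices column-wise, one merged matrix per
probe set ID. To keep the snippet self-contained, plain matrices stand in
for the 'pm' slot of the ProbeSet objects stored in all.ppsets; the names
fake.pm, per.array and merge.pm are invented for this illustration.

```r
ids <- c("foo123_at", "foo456_at")

## Stand-in for one array's PM intensities: a (probes x 1 sample) matrix.
## With real data this would be the 'pm' slot of a ProbeSet object.
fake.pm <- function(sample, n.probes) {
  matrix(seq_len(n.probes) * sample, ncol = 1,
         dimnames = list(NULL, paste0("array", sample)))
}

## One named list per array, mimicking what the reading loop collects
per.array <- lapply(1:3, function(s) {
  setNames(list(fake.pm(s, 4), fake.pm(s, 5)), ids)
})

## For each probe set ID, put the columns from all arrays side by side
merge.pm <- function(per.array, ids) {
  merged <- lapply(ids, function(id) {
    do.call(cbind, lapply(per.array, function(x) x[[id]]))
  })
  setNames(merged, ids)
}

merged <- merge.pm(per.array, ids)
dim(merged[["foo123_at"]])  ## probes x arrays
```

The same could be done for the MM values (slot 'mm') if you need them.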

> 
> Can anyone help? Thanks.

I hope this does.

L.

> 
> Xuejun

-- 
--------------------------------------------------------------
Laurent Gautier			CBS, Building 208, DTU
PhD. Student			DK-2800 Lyngby,Denmark	
tel: +45 45 25 24 89		http://www.cbs.dtu.dk/laurent