[BioC] Maximum number of CEL files for ReadAffy() in Affy package.
Markus Schmidberger
schmidb at ibe.med.uni-muenchen.de
Wed Jul 23 06:02:16 CEST 2008
Hi,
there is one more way to handle large data sets: the affyPara
package (http://www.bioconductor.org/packages/bioc/html/affyPara.html).
It requires a computer cluster and runs the preprocessing in
parallel.
With enough machines, the number of arrays you can preprocess is limited
only by the size of the cluster, and you also get a good speedup in
computation time.
I think 5-6 computers with 4 GB of memory each should be enough for
2000 arrays (depending on the chip type).
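
To make the size limit concrete, here is a rough back-of-the-envelope calculation in R. It is only a sketch under stated assumptions: the 1164 x 1164 cell layout is an assumed chip size (the original post does not name the chip type), the 6-node cluster is simply the upper end of the estimate above, and the arrays are assumed to be split evenly across nodes with intensities stored as 8-byte doubles, one column per array. It relies on the fact that R 2.x cannot allocate a matrix with more than 2^31 - 1 elements, which is what the allocMatrix error reports.

```r
# Assumed chip layout (hypothetical; the post does not state the chip type).
cells_per_array <- 1164 * 1164
n_arrays        <- 2035   # number of CEL files from the original post
n_nodes         <- 6      # assumed cluster size

# R 2.x limits a matrix to .Machine$integer.max (2^31 - 1) elements.
total_elements <- cells_per_array * n_arrays
exceeds_limit  <- total_elements > .Machine$integer.max

# Largest number of arrays of this size that fits into one matrix.
max_arrays <- floor(.Machine$integer.max / cells_per_array)

# Memory for the raw intensities only, stored as 8-byte doubles.
bytes_total     <- total_elements * 8
arrays_per_node <- ceiling(n_arrays / n_nodes)
bytes_per_node  <- cells_per_array * arrays_per_node * 8

cat("exceeds matrix limit:", exceeds_limit, "\n")
cat("max arrays in one matrix:", max_arrays, "\n")
cat("total GiB:", round(bytes_total / 2^30, 1), "\n")
cat("GiB per node:", round(bytes_per_node / 2^30, 1), "\n")
```

Under these assumptions the full intensity matrix is too large for a single R 2.x matrix no matter how much RAM the machine has, while each node's share stays below the element limit. Working copies made during preprocessing need memory on top of the raw intensities, so the per-node figure is a lower bound.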
Best
Markus
Hailong Cui wrote:
> Dear all,
>
> First, I apologize for the mass email. I've been reading manuals, googling,
> searching the archive of the mailing list, but still cannot find an exact
> answer to my problem.
>
> (1) Question: Can a large number of CEL files cause an overflow in the
> function ReadAffy() in the affy package? Is there any way to fix this?
> The alternatives seem to be other software, such as RMAExpress and dChip
> on Windows XP. Any suggestions?
>
> (2) Background: I am trying to read all the CEL files in a directory
> into an AffyBatch object, so that I can use the functions in
> the affy package. Specifically, I want to run RMA and dChip normalization
> and produce MA plots. On my workstation (48 64-bit CPUs, 500 GB memory),
> ReadAffy() worked fine for 241 CEL files, but when I moved on to 2,035 CEL
> files, it failed and kept showing the error message below. Each CEL file
> has roughly 50k rows. On the bright side, I tried justRMA()
> and got the expression values in text format.
>
>
> > R
> > library(affy)
> > Data <- ReadAffy()
> Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
>   allocMatrix: too many elements specified
>
>
> FYI, below is the session information on my workstation.
>
>
> > sessionInfo()
> R version 2.7.1 (2008-06-23)
> ia64-unknown-linux-gnu
>
> locale:
> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] geneplotter_1.18.0 annotate_1.18.0
> [3] xtable_1.5-2 AnnotationDbi_1.2.2
> [5] RSQLite_0.6-9 DBI_0.2-4
> [7] lattice_0.17-8 BufferedMatrixMethods_1.4.0
> [9] BufferedMatrix_1.4.0 affy_1.18.2
> [11] preprocessCore_1.2.0 affyio_1.8.0
> [13] Biobase_2.0.1
>
> loaded via a namespace (and not attached):
> [1] grid_2.7.1 KernSmooth_2.22-22 RColorBrewer_1.0-2
>
>
>
>
> Thank you so much for reading this and I would appreciate your reply.
>
> Hailong
>
>
>