[BioC] About big data set for Affy R package
Rob Dunne
Rob.Dunne at csiro.au
Sat Dec 22 05:31:51 CET 2012
Hi Benilton,
Unless I am missing something, ff won't help in this case. From the ff
help page:
"Currently ff objects cannot have length zero and are limited to
‘.Machine$integer.max’ elements"
and .Machine$integer.max is 2^31-1. This limit is exceeded when you try to
load 328 Affy exon arrays, hence:
library(ff)
library(oligo)
data<-read.celfiles(filenames=files)
#Loading required package: pd.huex.1.0.st.v2
#Loading required package: RSQLite
#Loading required package: DBI
#Platform design info loaded.
#Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
# missing value where TRUE/FALSE needed
#In addition: Warning message:
#In ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(), :
# NAs introduced by coercion
traceback()
#4: ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(),
#       basename(name)))
#3: createFF("intensities-", dim = c(nr, length(filenames)))
#2: smartReadCEL(filenames, sampleNames, headdetails = headdetails)
#1: read.celfiles(filenames = ff)
This is why I went down the path of modifying read.celfiles to use
big.matrix, which does not have the 2^31-1 limit.
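As a rough check of the numbers (a sketch only, assuming each array contributes the 6553600 features reported in the assayData summary quoted further down): 328 arrays need more cells than .Machine$integer.max allows, and coercing that length to integer gives NA, which would be consistent with the "NAs introduced by coercion" warning and the "missing value where TRUE/FALSE needed" error above. The variable names are made up for illustration.

```r
# Illustrative names; 6553600 is the per-array feature count from the
# assayData summary in the quoted message (assumed to apply here too).
n_features <- 6553600
n_arrays   <- 328

# Double arithmetic, so the product itself does not overflow.
n_cells <- n_features * n_arrays

# Does the intensity matrix need more cells than one ff object allows?
exceeds_limit <- n_cells > .Machine$integer.max

# Largest number of such arrays that still fits under the limit.
max_arrays <- floor(.Machine$integer.max / n_features)

# Coercing the oversized length to integer yields NA (with a warning,
# suppressed here).
len <- suppressWarnings(as.integer(n_cells))

print(c(cells = n_cells, limit = .Machine$integer.max, max_arrays = max_arrays))
print(is.na(len))
```

On this reckoning, 327 such arrays would be the most that a single ff matrix could hold.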
Bye
Rob
sessionInfo()
#R version 2.15.0 (2012-03-30)
#Platform: x86_64-unknown-linux-gnu (64-bit)
#
#locale:
# [1] LC_CTYPE=en_AU.UTF-8 LC_NUMERIC=C
# [3] LC_TIME=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8
# [5] LC_MONETARY=en_AU.UTF-8 LC_MESSAGES=en_AU.UTF-8
# [7] LC_PAPER=C LC_NAME=C
# [9] LC_ADDRESS=C LC_TELEPHONE=C
#[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] tools stats graphics grDevices utils datasets methods
#[8] base
#
#other attached packages:
#[1] pd.huex.1.0.st.v2_3.6.0 RSQLite_0.11.2 DBI_0.2-5
#[4] oligo_1.20.4 oligoClasses_1.18.0 ff_2.2-10
#[7] bit_1.1-9
#
#loaded via a namespace (and not attached):
# [1] affxparser_1.28.1 affyio_1.24.0 Biobase_2.16.0
# [4] BiocGenerics_0.2.0 BiocInstaller_1.4.9 Biostrings_2.24.1
# [7] codetools_0.2-8 compiler_2.15.0 foreach_1.4.0
#[10] IRanges_1.14.4 iterators_1.0.6 preprocessCore_1.18.0
#[13] splines_2.15.0 stats4_2.15.0 zlibbioc_1.2.0
On 12/21/2012 10:45 PM, Benilton Carvalho wrote:
> Hi Rob,
>
> looks like you're running an old version of oligo.
>
> Today, our approach is:
>
> library(ff)
> library(oligo)
> my.data <- read.celfiles(<CEL file names>)
>
> HTH,
> b
>
> On 21 December 2012 01:02, Rob Dunne <Rob.Dunne at csiro.au> wrote:
>> Hi Wei Liu,
>>
>> if they are Affymetrix 1.0 ST exon arrays, I can send you a modified version of read.celfiles from the oligo package that
>> should read a 300-microarray data set. I don't know if it will work for other array types; possibly not without some work.
>> It is a modified version of read.celfiles that uses the big.matrix class from the bigmemory package.
>>
>> my.data<-read.celfiles(filenames=ff,useAffyio=FALSE)
>> my.data
>> #assayData: 6553600 features, 335 samples
>> #Annotation: pd.huex.1.0.st.v2
>>
>> Bye
>> Rob
>>
>>
>>
>>
>> On 12/20/2012 01:21 AM, 刘伟 wrote:
>>> Dear Buddy,
>>> I am a user of the affy R package. When I attempt to handle a large
>>> number (approx. 300) of microarrays, I always get a memory allocation
>>> error from R. I searched the web but did not find any solution for
>>> ReadAffy() with a large dataset. I do not know if the problem can be
>>> fixed in some way. Any suggestions are appreciated. Thanks.
>>>
>>> Sincerely,
>>> Wei Liu
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> -
>> Rob Dunne Fax: +61 2 9325 3200 Tel: +61 2 9325 3263
>> CSIRO Mathematics, Informatics and Statistics +61 2 9325 3100
>> Locked Bag 17, North Ryde, New South Wales, Australia, 1670
>> http://www.bioinformatics.csiro.au Email: Rob.Dunne at csiro.au
>>
>> Java has certainly revolutionized marketing and litigation.
>>