[BioC] about big data set for Affy R package

Rob Dunne Rob.Dunne at csiro.au
Sat Dec 22 05:31:51 CET 2012


Hi Benilton,

Unless I am missing something, ff won't help in this case. From the ff 
help page:

"Currently ff objects cannot have length zero and are limited to 
‘.Machine$integer.max’ elements"

and .Machine$integer.max is 2^31 - 1 = 2147483647. This is exceeded when 
you try to load 328 Affy exon arrays (6553600 features x 328 arrays = 
2149580800 cells), hence

library(ff)
library(oligo)
data<-read.celfiles(filenames=files)
#Loading required package: pd.huex.1.0.st.v2
#Loading required package: RSQLite
#Loading required package: DBI
#Platform design info loaded.
#Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
#  missing value where TRUE/FALSE needed
#In addition: Warning message:
#In ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(),  :
#  NAs introduced by coercion

traceback()
#4: ff(initdata = initdata, vmode = vmode, dim = dim, pattern = file.path(ldPath(),
#       basename(name)))
#3: createFF("intensities-", dim = c(nr, length(filenames)))
#2: smartReadCEL(filenames, sampleNames, headdetails = headdetails)
#1: read.celfiles(filenames = ff)
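
The arithmetic behind the error can be checked in base R without loading any CEL files. This is only a sketch: the feature count is the 6553600 reported for pd.huex.1.0.st.v2 elsewhere in this thread, and the assumption that ff coerces the requested length to integer before its range check is inferred from the warning and error messages above, not from the ff source.

```r
# Dimensions from this thread: 6553600 features per array, 328 arrays
n_features <- 6553600
n_arrays   <- 328

# Product is computed in double precision, so it does not overflow here
n_cells <- n_features * n_arrays
n_cells > .Machine$integer.max        # more cells than an integer can index

# Largest number of such arrays that still fits under the integer limit
max_arrays <- floor(.Machine$integer.max / n_features)

# Assumed mechanism: coercing the length to integer yields NA
# (with an "NAs introduced by coercion" warning, suppressed here)
len <- suppressWarnings(as.integer(n_cells))
is.na(len)

# An NA condition inside if() raises the error seen in the log
res <- tryCatch(
  if (len < 0 || len > .Machine$integer.max) "too long" else "ok",
  error = function(e) conditionMessage(e)
)
res
```

Under these assumptions the limit is crossed exactly at the 328th array, which would explain why slightly smaller data sets load without trouble.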

This is why I went down the path of modifying read.celfiles to use 
big.matrix, which does not have the 2^31 - 1 limit.
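
For readers who have not used big.matrix, here is a minimal sketch of a file-backed matrix from the bigmemory package being filled one column at a time. It is not the modified read.celfiles; the dimensions are deliberately tiny, the file names are made up, and runif() is a placeholder for the intensities that would be read from each CEL file.

```r
library(bigmemory)

# Toy dimensions; the data set in this thread is 6553600 x 328
n_features <- 1000
n_arrays   <- 5

# Fresh directory so the backing file does not already exist
backing_dir <- tempfile("bigmat-")
dir.create(backing_dir)

# File-backed matrix: the data live on disk, not in R's heap
intensities <- filebacked.big.matrix(
  nrow = n_features, ncol = n_arrays, type = "double",
  backingfile    = "intensities.bin",    # hypothetical file name
  descriptorfile = "intensities.desc",   # hypothetical file name
  backingpath    = backing_dir
)

# Fill one column per array, as a CEL reader would
for (j in seq_len(n_arrays)) {
  intensities[, j] <- runif(n_features)  # placeholder for CEL intensities
}

dim(intensities)
```

Writing column by column keeps only one array's worth of intensities in memory at a time, which is the point of using a disk-backed container for a data set of this size.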

Bye
Rob






sessionInfo()
#R version 2.15.0 (2012-03-30)
#Platform: x86_64-unknown-linux-gnu (64-bit)
#
#locale:
# [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
# [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
# [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
# [7] LC_PAPER=C                 LC_NAME=C
# [9] LC_ADDRESS=C               LC_TELEPHONE=C
#[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] tools     stats     graphics  grDevices utils     datasets methods
#[8] base
#
#other attached packages:
#[1] pd.huex.1.0.st.v2_3.6.0 RSQLite_0.11.2 DBI_0.2-5
#[4] oligo_1.20.4            oligoClasses_1.18.0 ff_2.2-10
#[7] bit_1.1-9
#
#loaded via a namespace (and not attached):
# [1] affxparser_1.28.1     affyio_1.24.0 Biobase_2.16.0
# [4] BiocGenerics_0.2.0    BiocInstaller_1.4.9 Biostrings_2.24.1
# [7] codetools_0.2-8       compiler_2.15.0 foreach_1.4.0
#[10] IRanges_1.14.4        iterators_1.0.6 preprocessCore_1.18.0
#[13] splines_2.15.0        stats4_2.15.0 zlibbioc_1.2.0


On 12/21/2012 10:45 PM, Benilton Carvalho wrote:
> Hi Rob,
>
> looks like you're running an old version of oligo.
>
> Today, our approach is:
>
> library(ff)
> library(oligo)
> my.data <- read.celfiles(<CEL file names>)
>
> HTH,
> b
>
> On 21 December 2012 01:02, Rob Dunne <Rob.Dunne at csiro.au> wrote:
>> Hi Wei Liu,
>>
>> if they are Affymetrix 1.0 ST exon arrays, I can send you a modified version of read.celfiles from the oligo package that
>> should read a 300-microarray data set. I don't know if it will work for other array types, possibly not without some work.
>> It is a modified version of read.celfiles that uses the big.matrix class from the bigmemory package.
>>
>> my.data<-read.celfiles(filenames=ff,useAffyio=FALSE)
>> my.data
>> #assayData: 6553600 features, 335 samples
>> #Annotation: pd.huex.1.0.st.v2
>>
>> Bye
>> Rob
>>
>>
>>
>>
>> On 12/20/2012 01:21 AM, 刘伟 wrote:
>>> Dear Buddy,
>>> I am a user of the affy R package. When I attempt to handle a large
>>> number (approx. 300) of microarrays, I always get a memory
>>> allocation error from R. I searched the web but did not find any
>>> solution for ReadAffy() with a large dataset. I do not know if the
>>> problem can be fixed in some way. Any suggestion is appreciated. Thanks.
>>>
>>> Sincerely,
>>> Wei Liu
>>>
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> --
>> -
>> Rob Dunne         Fax: +61 2 9325 3200     Tel: +61 2 9325 3263
>> CSIRO Mathematics, Informatics and Statistics   +61 2 9325 3100
>> Locked Bag 17, North Ryde, New South Wales, Australia, 1670
>> http://www.bioinformatics.csiro.au Email: Rob.Dunne at csiro.au
>>
>>          Java has certainly revolutionized marketing and litigation.


-- 
-
Rob Dunne         Fax: +61 2 9325 3200     Tel: +61 2 9325 3263
CSIRO Mathematics, Informatics and Statistics   +61 2 9325 3100
Locked Bag 17, North Ryde, New South Wales, Australia, 1670
http://www.bioinformatics.csiro.au Email: Rob.Dunne at csiro.au

         Java has certainly revolutionized marketing and litigation.


