[BioC] ReadAffy gives Error

Jarno Tuimala jtuimala at csc.fi
Thu Apr 5 08:11:55 CEST 2007


Hello!

I tested the memory usage on a Windows machine with 9 hgu133plus2 chips.
On disk these files took 284 MB. After reading the data into R, the stored
AffyBatch consumed about 100 MB of memory. However, while the AffyBatch
object was being constructed with ReadAffy(), R consumed a peak of about
700 MB. During normalization the memory requirement was much lower,
about 200 MB.

I'm not sure whether you can extrapolate from these results, but assuming
memory use scales linearly with the number of chips, you would need about
7 GB of memory to load all 88 chips at the same time using ReadAffy().
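
The arithmetic behind that estimate can be sketched in a few lines of base R.
This is only a back-of-the-envelope calculation: it assumes memory grows
linearly with the number of chips, which was not verified, and the variable
names are made up for illustration.

```r
# Figures measured for 9 hgu133plus2 chips (see above), in MB.
n_measured   <- 9
peak_read_mb <- 700   # peak while ReadAffy() builds the AffyBatch
stored_mb    <- 100   # the finished AffyBatch
norm_mb      <- 200   # approximate need during normalization

# Assumption (not verified): memory grows linearly with chip count.
n_target <- 88
scale    <- n_target / n_measured

# Scale each figure to 88 chips and convert MB to GB.
estimate_gb <- c(
  peak_read = peak_read_mb,
  stored    = stored_mb,
  normalize = norm_mb
) * scale / 1024

print(round(estimate_gb, 1))
```

The peak_read entry comes out a little under 7 GB, so under this assumption
the ReadAffy() step, not the normalization, is the bottleneck.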

Best regards,
Jarno



On Wed, 4 Apr 2007, Kasper Daniel Hansen wrote:

> I have no idea as to the RAM usage, but you could try to go the
> route of reading in the expression matrix as Jim said and then
> manually constructing the AffyBatch. You can also use affxparser for
> this step, which might be even less memory hungry.
>
> I agree that a failure to read in the data does not look good for the
> QC stuff.
>
> Kasper
>
> On Apr 4, 2007, at 1:15 PM, Boel Brynedal wrote:
>
>> Dear all,
>>
>> How much RAM is needed to read and analyze 88 hgu133plus2 arrays?
>> As I understand it, the ReadAffy() step itself would not be the
>> problem; the normalization would. In this case I want to do all of
>> the quality controls, so I need the AffyBatches.
>> I had the impression that 4 GB would be enough.
>>
>> Best,
>> Boel
>>
>> On Wed, 2007-04-04 at 09:50 -0400, James W. MacDonald wrote:
>>> Boel Brynedal wrote:
>>>>>> Error: cannot allocate vector of size 931491 Kb
>>>>>
>>>>> This error indicates that you need more RAM.
>>>>
>>>>
>>>> But I have 4GB of RAM, shouldn't that be enough?
>>>
>>> Depends on what kind of chip you are using. It might work for older
>>> chips (e.g., hgu95av2), but probably not for the current generation
>>> of 3' arrays (e.g., hgu133plus2).
>>>
>>>> Is there a limit on how much memory R can use? And if there is,
>>>> how can I change it?
>>>
>>> There are limits on the size of objects, but you will not be hitting
>>> them here. On Linux, R will take all the memory it requires without
>>> any intervention by you, so if you are getting this error you have
>>> hit the wall. Are you doing other memory-hungry things concurrently?
>>>
>>> There are ways around this that don't require purchasing RAM. First,
>>> you can use justRMA(), which will undoubtedly be able to process all
>>> your chips. The downside is no AffyBatch, so you can't do QA plots of
>>> the raw data.
>>>
>>> Another alternative is to use read.probematrix(), which will read in
>>> just the PM and/or MM probes. You can use these data for quality
>>> assessment, etc., but you will be missing all the niceties that come
>>> with using an AffyBatch.
>>>
>>>>
>>>>
>>>>>> Error in isVersioned(object) : error in evaluating the argument
>>>>>> 'object' in selecting a method for function 'isVersioned'
>>>>>
>>>>> Not sure about this one. It may just be an artifact of the first
>>>>> error, or indicate a mismatch in your package versions. How did you
>>>>> install the BioC packages? What is your sessionInfo()?
>>>>
>>>>
>>>> Bioconductor was installed using biocLite(); other packages were
>>>> also downloaded and installed (e.g., using
>>>> R CMD INSTALL simpleaffy).
>>>
>>> You should use biocLite() for all package installation. If you just
>>> grab things and install directly, you always run the risk that you
>>> are installing something that is an incorrect version for the version
>>> of R/BioC that you have. Using biocLite() ensures that you get the
>>> correct thing.
>>>
>>> For instance, simpleaffy 2.4.2 is not the correct version for use
>>> with BioC 1.9. You should have 2.8.0. This doesn't explain the
>>> isVersioned error, as your affy/Biobase/affyio are all correct
>>> versions. It is probably just because you ran out of memory.
>>>
>>> Best,
>>>
>>> Jim
>>>
>>>
>>>
>>>
>>>>
>>>> This is my sessionInfo()
>>>>
>>>>> sessionInfo()
>>>>
>>>> R version 2.4.1 (2006-12-18)
>>>> x86_64-unknown-linux-gnu
>>>>
>>>> locale:
>>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>>
>>>> attached base packages:
>>>>  [1] "grid"      "splines"   "tools"     "stats"     "graphics"  "grDevices"
>>>>  [7] "utils"     "datasets"  "methods"   "base"
>>>>
>>>> other attached packages:
>>>>  simpleaffy  genefilter    survival     IDPmisc     lattice     affyPLM
>>>>     "2.4.2"    "1.12.0"      "2.30"     "0.9.1"   "0.14-16"    "1.10.0"
>>>>       gcrma matchprobes    affydata        affy      affyio     Biobase
>>>>     "2.6.0"     "1.6.0"    "1.10.0"    "1.12.2"     "1.2.0"    "1.12.2"
>>>>
>>>> I can read 4 CEL files without any problems, so maybe this is a
>>>> memory problem altogether, but I really thought 4 GB of RAM would
>>>> be enough.
>>>>
>>>> Thankful for any advice,
>>>> Boel
>>>>
>>>>> Best,
>>>>>
>>>>> Jim
>>>>
>>>>
>>>>>> Any suggestions to what is wrong?
>>>>>> As you might imagine, I am quite new in this field.
>>>>>>
>>>>>> Best regards,
>>>>>> Boel Brynedal, PhD student, Karolinska Institutet, Sweden.
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioconductor mailing list
>>>>>> Bioconductor at stat.math.ethz.ch
>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>>
>>>>
>>>
>>>
>>


