[Bioc-sig-seq] PDict question

Robert Gentleman rgentlem at fhcrc.org
Tue Jun 3 18:39:38 CEST 2008


Hi,

  Whether you have enough RAM depends on many things: which genome you
are matching to (some are larger than others), how careful you are about
dropping sequences when they are no longer needed, etc. As you have not
provided those details, it is not possible to give more concrete advice.

You could try breaking the sequences down into smaller subsets: say,
build a PDict on half of the data, match, then repeat on the other half
(or thirds, or whatever size works on your architecture).  In general,
you may want to consider moving the analysis to a Linux box with more
RAM (we typically use machines with 32 or 64 GB of RAM, which are
surprisingly inexpensive these days).
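
For illustration, the split-and-repeat idea above might be sketched
roughly as follows. This is only a sketch under stated assumptions, not
something tested on your data: chunk_indices() and count_in_chunks() are
hypothetical helper names, the chunk size is a value you would have to
tune, and it assumes that Biostrings is installed, that all reads have
the same width, and that PDict() and countPDict() are available in your
version of Biostrings.

```r
# Hypothetical helper: split the indices 1..n into consecutive chunks
# holding at most chunk_size elements each.
chunk_indices <- function(n, chunk_size) {
  if (n < 1) return(list())
  starts <- seq(1, n, by = chunk_size)
  ends <- pmin(starts + chunk_size - 1, n)
  Map(seq, starts, ends)
}

# Hypothetical helper: count the matches of each read in one subject
# sequence, building a PDict on only one chunk of reads at a time.
# Assumes 'reads' is a constant-width DNAStringSet and 'subject' is a
# single DNAString (e.g. one chromosome).
count_in_chunks <- function(reads, subject, chunk_size) {
  counts <- integer(length(reads))
  for (idx in chunk_indices(length(reads), chunk_size)) {
    pdict <- Biostrings::PDict(reads[idx])
    counts[idx] <- Biostrings::countPDict(pdict, subject)
    rm(pdict)  # drop the dictionary before building the next one
    gc()
  }
  counts
}
```

With the objects from your session, a call might look like
count_in_chunks(NM_seq_clean, subject, 1e6), where subject stands for
whichever sequence you are matching against. Only one chunk's PDict is
held in memory at a time, at the cost of scanning the subject once per
chunk.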

   best wishes
    Robert



Stephen Henderson wrote:
> Hi Joseph
> You look like you should have enough RAM on your MacPro. Have you compiled a 64-bit version of R for the Mac? The CRAN binaries are 32-bit and will restrict the available memory.
>  
> Stephen
>  
> 
> ________________________________
> 
> From: bioc-sig-sequencing-bounces at r-project.org on behalf of Joseph Dhahbi, P.h.D.
> Sent: Tue 03/06/2008 16:21
> To: bioc-sig-sequencing at r-project.org
> Cc: bioc-sig-sequencing at r-project.org
> Subject: [Bioc-sig-seq] PDict question
> 
> 
> 
> Hello
> I need help getting around the memory error reported
> below, especially since I cannot add any more RAM.
> Here is the Hardware Overview:
>    Model Name:  Mac Pro
>    Model Identifier:    MacPro1,1
>    Processor Name:      Dual-Core Intel Xeon
>    Processor Speed:     2.66 GHz
>    Number Of Processors:        2
>    Total Number Of Cores:       4
>    L2 Cache (per processor):    4 MB
>    Memory:      20 GB
>    Bus Speed:   1.33 GHz
>    Boot ROM Version:    MP11.005C.B08
>    SMC Version: 1.7f10
>    Serial Number:       G87052SGUPZ
> 
> 
> 
>> NM_seq=readSolexaFastA(NM_fa)
>> NM_alf=alphabetFrequency(NM_seq, baseOnly=TRUE)
>> NM_seq_clean = NM_seq[NM_alf[,"other"]==0]
>> length(NM_seq)
> [1] 4820218
>> length(NM_seq_clean)
> [1] 4817537
>> NM_seq_clean
>    A DNAStringSet instance of length 4817537
>            width seq
>        [1]    36 GATCGGAAGAGCTCGTATGCCGTCTTCTGCTTGGAT
>        [2]    36 GTGGTAATTCATCAGATCTCGGATGGCATTGGTCAT
>        [3]    36 GGGAGGTCACTAATGGAGACACACAGAAATGTAACA
>        [4]    36 GGGATTGGTTTTTTGTTACTGATTTGTTTGAGTTCA
>        [5]    36 GTGGTAATTTTGACTTTTTAGGTTAATTTATTTTTT
>        [6]    36 GATCGGAAGGAGCTCGTATGCCGTCTTCTGCTTAGA
>        [7]    36 GGTCAGTTGTGTTCTCCTGAGTAGGTTGTGTGAATG
>        [8]    36 GGGAGGTCACTAATGGAGACACACAGAAATGTAACA
>        [9]    36 GGGAGGCTGAGGCAGGAGAATGGCATGAACCTAGAT
>        ...   ... ...
> [4817529]    36 TTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAG
> [4817530]    36 CATCAATGTATCTTAAGGCGTAAATTGTAAGCGTTA
> [4817531]    36 CGAGCAGCGACGCATCACCCAGCTAGATCGGAAGAG
> [4817532]    36 GCAATGCCACTGGCGCGACAACCGGGACACCATAGG
> [4817533]    36 CCTCGCCGGACACGCTGAACTTGTGGCCGTTTTCGT
> [4817534]    36 CCATTGTACAACGTATCGACATATCCTCCACCCGCC
> [4817535]    36 CCCCCTGAACCTGAAACATAAAATGAATGCAATTGT
> [4817536]    36 ACCATGTTGTCCAAGGGCGAATTCTGCAGATATCCA
> [4817537]    36 CAGGGGCCGGCGGCTGGCTAGGGCTGCAGCGTTAAA
> 
>> NM_seq_pDict=PDict(NM_seq_clean)
> Error in .PDict(dict, names(dict), tb.start, tb.end, drop.head, drop.tail,  :
>    alloc_actree_nodes_buf(): failed to alloc actree_nodes_buf
> R(433,0xa000d000) malloc: *** vm_allocate(size=4032987136) failed (error code=3)
> R(433,0xa000d000) malloc: *** error: can't allocate region
> R(433,0xa000d000) malloc: *** set a breakpoint in szone_error to debug
> 
>> sessionInfo()
> R version 2.7.0 (2008-04-22)
> i386-apple-darwin8.10.1
> 
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> 
> attached base packages:
> [1] tools     stats     graphics  grDevices utils
>     datasets  methods   base
> 
> other attached packages:
> [1] BiostringsCinterfaceDemo_0.1.2 Biostrings_2.8.9
>               Biobase_2.0.1
> 
> 
> 
> 
> Regards,
> Joseph
> 
> Joseph M. Dhahbi, PhD
> Childrens Hospital Oakland Research Institute
> 5700 Martin Luther King Jr. Way
> Oakland, CA 94609
> USA
> Ph.(510)428-3885 EXT.5743
> Cell.(702)335-0795
> Fax (510)450-7910
> jdhahbi at chori.org
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

-- 
Robert Gentleman, PhD
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
PO Box 19024
Seattle, Washington 98109-1024
206-667-7700
rgentlem at fhcrc.org


