[BioC] BitSeq getExpression crash

Peter Glaus glaus at cs.man.ac.uk
Tue Mar 26 14:28:41 CET 2013


Dear Maayan,
not sure what is the problem as I was not able to replicate this error.
Could you please try:
1) running with verbose=TRUE option, so there will be a bit more output.
2) use option outPrefix. With this option, BitSeq does not use temporary 
directories.
Examples: assuming your working direcotry in R is:
/projects/mkreitzman_prj/expression_quantification_testing/testing/BitSeq
you could run:
getExpression("A08485_gr4.sam_mini.sam", 
"../test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa", 
outPrefix="A08485_gr4.sam_mini.run1", log=TRUE, seed=47, verbose=TRUE);

Which should create 3 file in the directory: 
A08485_gr4.sam_mini.run1.rpkm, A08485_gr4.sam_mini.run1.thetaMeans and 
A08485_gr4.sam_mini.run1.mean.

(You can also use relative/absolute paths in the outPrefix, just make 
sure the directories exist.)

If the run fails again, you can have a look into the directory and see 
what kind of files were created.

Regards,
Peter.

On 25/03/13 22:00, Maayan Kreitzman wrote:
> Hi Peter,
> I'm still having some difficulty with the first step of BitSeq.
> I made myself a mini dataset with just 5 million reads so that I could run everything in my R console without writing and R script and submitting to the cluster (just until I know what I'm doing)
>
> but, I got this error:
> Error in getMeanVariance(c(outFile), meanFile, log = log, pretend = pretend) :
>    Conditions: file /tmp/RtmpRHbr1N/A08485_gr4.sam_mini-BS-27c66ebf4a21.rpkm failed to open.
>
> since it seemed to be an issue with the temp directory, I tried to change my TMPDIR in the shell to somewhere with plenty of space:
> [mkreitzman at xhost09 ~]$ echo $TMPDIR
> /projects/wtss_scratch/maayan
>
> but, this did not make a difference to where the temp files were created.
> I copied the whole session below.
>
> thanks,
> Maayan
>
>
>> res1 <- getExpression("/projects/mkreitzman_prj/expression_quantification_testing/testing/BitSeq/A08485_gr4.sam_mini.sam", "/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa",
> + log = TRUE, seed=47)
> ## Computing alignment probabilities.
> [time: +1.283333 m]
> [time: +0.333333 m]
> [time: +0.000000 m]
> [time: +1.200000 m]
> [time: +0.500000 m]
> [time: +0.000000 m]
> ## Estimating transcript expression levels.
> Mappings: 1606295
> Ntotal: 2408007
>    10000 [time: +0.000000 s]
>    100000 [time: +0.000000 s]
>    1000000 [time: +2.000000 s]
> Finished Reading!
> Total hits = 3212590
> Isoforms: 183986
> Burn in: 1000 DONE. [time: +6.633333 m]
>
> Sampling DONE. [time: +9.600000 m]
> rHat (for 1000 samples)
>     rHat (rHat from subset |    tid | mean theta)
>      1.0040 ( 1.0040 |  36651 |  0.0000)
>      1.0040 ( 1.0072 | 178210 |  0.0000)
>      1.0036 ( 1.0088 | 148680 |  0.0000)
>    Mean rHat of worst 10 transcripts: 1.003527
>    Mean C0: (50 50 50 50 ). Nunmap: 801712
>
> Producing 649 final samples.
>
> Sampling DONE. [time: +6.833333 m]
> rHat (for 649 samples)
>     rHat (rHat from subset |    tid | mean theta)
>      1.0061 ( 1.0051 | 157831 |  0.0000)
>      1.0059 ( 1.0048 | 104363 |  0.0000)
>      1.0058 ( 1.0068 |  71659 |  0.0000)
>    Mean rHat of worst 10 transcripts: 1.005543
>    Mean C0: (50 50 50 50 ). Nunmap: 801712
>
> Total samples: 6596
> ## Computing means.
> Error in getMeanVariance(c(outFile), meanFile, log = log, pretend = pretend) :
>    Conditions: file /tmp/RtmpRHbr1N/A08485_gr4.sam_mini-BS-27c66ebf4a21.rpkm failed to open.
> ________________________________________
> From: Peter Glaus [glaus at cs.man.ac.uk]
> Sent: Friday, March 22, 2013 5:32 AM
> To: Maayan Kreitzman
> Subject: Re: BitSeq getExpression crash
>
> Hi Maayan,
> I believe the error is caused by process running out of memory. I am not
> 100% sure, but when I saw this kinds of errors before, it was caused by
> lack of memory. The estimation can be quite CPU and memory intensive, so
> I advice running it on a computing cluster instead of using regular
> desktop/notebook machine.
>
> Regarding your function call, when running actual analysis (not just
> testing/trying out), please use higher values for MCMC_burnIn,
> MCMC_samplesN and MCMC_samplesSave (the default when leaving these blank
> is 1000 and is usually "good enough"), the computation will take longer,
> however the estimates will be much more accurate as well.
> (The values 200, 200, 50 are used in the vignette because the example
> data is very small, and the vignette has to run within time limit.)
>
> Also, for future reference when you have questions regarding
> Bioconductor packages, please post to Bioconductor user mailing list
> (and CC package author), as you might sometimes get replies from other
> users and also your post might help some other users if they encounter
> similar problem in the future.
>
> Best regards,
> Peter.
>
> On 21/03/13 21:46, Maayan Kreitzman wrote:
>> Dear Peter,
>> I'm trying to run BitSeq, and am running into a problem after several hours of the getExpression function running.
>> This same thing happened twice, on different servers. What weird is that not only does the function crash, it actually exits R.
>> this is the error message:
>>
>> terminate called after throwing an instance of 'std::bad_alloc'
>>     what():  St9bad_alloc
>> Aborted
>>
>> I have no experience whatsoever with R, so this may be a novice mistake, but your help would be greatly appreciated.
>> I've copied the whole session below.
>>
>> thanks in advance,
>> Maayan
>>
>>
>>> library("BitSeq")
>> Loading required package: Rsamtools
>> Loading required package: IRanges
>> Loading required package: BiocGenerics
>>
>> Attaching package: ‘BiocGenerics’
>>
>> The following object(s) are masked from ‘package:stats’:
>>
>>       xtabs
>>
>> The following object(s) are masked from ‘package:base’:
>>
>>       anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
>>       get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
>>       pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
>>       rownames, sapply, setdiff, table, tapply, union, unique
>>
>> Loading required package: GenomicRanges
>> Loading required package: Biostrings
>> Loading required package: zlibbioc
>>> res1 <- getExpression("/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/bowtie2transcriptome/A08473_gr4.sam",
>> + "/projects/mkreitzman_prj/expression_quantification_testing/testing/test_data/strand_specific/transcriptome/Homo_sapiens.GRCh37.69.cdna.all.fa",
>> + log = TRUE, MCMC_burnIn=200,MCMC_samplesN=200,MCMC_samplesSave=50,seed=47)
>> ## Computing alignment probabilities.
>> [time: +1.866667 m]
>> [time: +36.400000 m]
>> [time: +0.000000 m]
>> [time: +117.050000 m]
>> [time: +0.500000 m]
>> [time: +0.000000 m]
>> ## Estimating transcript expression levels.
>> Mappings: 71092830
>> Ntotal: 123098679
>>     10000 [time: +1.000000 s]
>>     100000 [time: +0.000000 s]
>>     1000000 [time: +3.000000 s]
>>     10000000 [time: +25.000000 s]
>> Read only 14186178 reads.
>> Finished Reading!
>> Total hits = 28372355
>> Isoforms: 183985
>> Burn in: 200 DONE. [time: +12.016667 m]
>>
>> Sampling DONE. [time: +12.850000 m]
>> rHat (for 200 samples)
>>      rHat (rHat from subset |    tid | mean theta)
>>       1.0252 ( 1.1173 |  89080 |  0.0000)
>>       1.0216 ( 1.1351 | 126802 |  0.0000)
>>       1.0183 ( 1.0151 | 183201 |  0.0000)
>>     Mean rHat of worst 10 transcripts: 1.018596
>>     Mean C0: (3516 3520 3529 3518 ). Nunmap: 52005849
>>
>> Producing 33 final samples.
>>
>> Sampling DONE. [time: +2.166667 m]
>> rHat (for 33 samples)
>>      rHat (rHat from subset |    tid | mean theta)
>>       1.1193 ( 1.1332 | 117458 |  0.0000)
>>       1.1181 ( 1.1229 | 158878 |  0.0000)
>>       1.1074 ( 1.1004 |  43840 |  0.0000)
>>     Mean rHat of worst 10 transcripts: 1.108279
>>     Mean C0: (3528 3512 3523 3520 ). Nunmap: 52005849
>>
>> Total samples: 932
>> terminate called after throwing an instance of 'std::bad_alloc'
>>     what():  St9bad_alloc
>> Aborted



More information about the Bioconductor mailing list