

Do you need all the sequence data at once?

Instead of making a smaller BAM file, can you read in just a portion of
your large BAM file?

library(Rsamtools)  # attaches GenomicRanges and IRanges too

# Region to read: chr13:28608234-28608363
data.gr <- GRanges(seqnames = "chr13",
                   ranges = IRanges(start = 28608234, end = 28608363),
                   strand = "+")

# Selecting by 'which' requires a BAM index (e.g. HS1808.bam.bai)
params <- ScanBamParam(
    which = data.gr,
    flag = scanBamFlag(isUnmappedQuery = FALSE, isDuplicate = NA,
                       isValidVendorRead = TRUE),
    simpleCigar = FALSE, reverseComplement = FALSE,
    what = c("qname", "flag", "rname", "seq", "strand", "pos", "mpos",
             "qwidth", "cigar", "qual", "mapq", "isize", "mrnm"),
    tag = "RG")  # change to what you want
aln1 <- scanBam("HS1808.bam", param = params)
aln1[[1]]  # records overlapping the first (here, only) range

That should work fine.
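
If the whole file does have to be processed eventually, the same idea can be applied one chromosome at a time. The following is only a sketch under assumptions: scan_by_chromosome() and its arguments are hypothetical names made up for illustration, the BAM file is assumed to be indexed, and the fields in 'what' are just the ones mentioned in this thread.

```r
# Sketch only: read a large, indexed BAM file one chromosome at a time.
# scan_by_chromosome(), bam_file, what and FUN are hypothetical names.
scan_by_chromosome <- function(bam_file,
                               what = c("rname", "strand", "pos", "seq"),
                               FUN = identity) {
  # Sequence names and lengths are taken from the BAM header.
  targets <- Rsamtools::scanBamHeader(bam_file)[[1]]$targets
  results <- lapply(names(targets), function(chr) {
    # One range covering the whole chromosome.
    region <- GenomicRanges::GRanges(
      seqnames = chr,
      ranges = IRanges::IRanges(start = 1, end = targets[[chr]])
    )
    param <- Rsamtools::ScanBamParam(which = region, what = what)
    # scanBam() returns one list element per range in 'which';
    # FUN reduces it so only a summary needs to stay in memory.
    FUN(Rsamtools::scanBam(bam_file, param = param)[[1]])
  })
  stats::setNames(results, names(targets))
}
```

For example, scan_by_chromosome("HS1808.bam", FUN = function(x) length(x$pos)) would count the records returned for each chromosome without holding all the sequences at once.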




-- 
Dr Paul Leo
Bioinformatician
UQ Diamantina Institute for Cancer, Immunology and Metabolic Medicine 
---------------------------------------------------------------------
Level 4, R Wing 
Princess Alexandra Hospital 
Ipswich Rd 
Woolloongabba QLD 4102 
Tel: +61 7 3240 7740  Mob: 041 303 8691  Fax: +61 7 3240 5946 
Email: p.leo@uq.edu.au   Web: http://www.di.uq.edu.au




-----Original Message-----
From: Dario Strbenac <D.Strbenac@garvan.org.au>
Reply-to: D.Strbenac@garvan.org.au
To: bioc-sig-sequencing@r-project.org
Subject: Re: [Bioc-sig-seq] scanBam Error
Date: Mon, 13 Dec 2010 17:15:38 +1100


I tried it out by making a smaller BAM file with only reads from one chromosome, and it worked fine. The full BAM file is 4 GB and has 75 million reads in it. Could the size be a problem? Could you test a BAM file of this size on your end, so that I don't have to send you one that big? Also, the error is different after I put the ScanBamParam in the right spot:

Error in .Call(func, file, index, "rb", NULL, flag, simpleCigar, ...) : 
  negative length vectors are not allowed

Integer overflow somewhere, maybe?
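
A rough check of whether that is plausible, as a sketch under assumptions: the read length below is a hypothetical value (only the read count is known here), and whether the C code actually adds up the sequence lengths into a single signed 32-bit integer is a guess, not something confirmed.

```r
# Back-of-the-envelope check; read_length is a hypothetical value.
n_reads     <- 75e6                    # reads in the full BAM file
read_length <- 50                      # assumed, not stated anywhere above
int_max     <- .Machine$integer.max    # 2^31 - 1, largest R/C signed int

total_bases <- n_reads * read_length   # double arithmetic, no overflow here
exceeds_int <- total_bases > int_max

# Shortest read length at which the total no longer fits in a signed int
min_overflow_length <- floor(int_max / n_reads) + 1

# R returns NA (with a warning) on integer overflow; C code typically
# wraps around to a negative number instead.
r_overflow <- suppressWarnings(int_max + 1L)

print(exceeds_int)
print(min_overflow_length)
print(is.na(r_overflow))
```

If a total like this were stored in a C int, a wrapped-around negative value passed on as a vector length would be consistent with the "negative length vectors" message.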

- Dario.

---- Original message ----
>Date: Sun, 12 Dec 2010 20:59:23 -0800
>From: Martin Morgan <mtmorgan@fhcrc.org>  
>Subject: Re: [Bioc-sig-seq] scanBam Error  
>To: D.Strbenac@garvan.org.au
>Cc: bioc-sig-sequencing@r-project.org
>
>On 12/12/2010 08:00 PM, Dario Strbenac wrote:
>> Hello,
>> 
>> I'm having trouble reading in a BAM file when "seq" is one of the
>> strings passed to the what argument of ScanBamParam. If it's not, then
>> the reading completes successfully. I don't understand what the
>> error means. It is:
>> 
>> Error in .io_bam(.scan_bam, file, index, reverseComplement, tmpl, param = param) : 
>>   INTEGER() can only be applied to a 'integer', not a 'closure'
>> 
>> The traceback is :
>> 
>>> traceback()
>> 4: .Call(func, file, index, "rb", NULL, flag, simpleCigar, ...)
>> 3: .io_bam(.scan_bam, file, index, reverseComplement, tmpl, param = param)
>> 2: scanBam("HS1808.bam", flag = ScanBamFlag(isDuplicate = FALSE), 
>>        param = ScanBamParam(reverseComplement = TRUE, what = c("rname", 
>>            "strand", "pos", "seq")))
>> 1: scanBam("HS1808.bam", flag = ScanBamFlag(isDuplicate = FALSE), 
>>        param = ScanBamParam(reverseComplement = TRUE, what = c("rname", 
>>            "strand", "pos", "seq")))
>> 
>> and the environment is :
>> 
>> R version 2.12.0 (2010-10-15)
>> Platform: x86_64-pc-mingw32/x64 (64-bit)
>> 
>> locale:
>> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                       LC_TIME=English_Australia.1252    
>> 
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base     
>> 
>> other attached packages:
>> [1] Rsamtools_1.2.1     Biostrings_2.18.0   GenomicRanges_1.2.0 IRanges_1.8.2      
>> 
>> loaded via a namespace (and not attached):
>> [1] Biobase_2.8.0
>
>Hi Dario -- this is some kind of error in Rsamtools' C code, but I'm not
>able to reproduce it on my end so can't track it down. Is there any way
>of producing and sharing with me an example file that has this problem?
>
>One thing (not causing the bug) in your traceback is that 'flag' should
>be an argument to ScanBamParam; as it is I think it is being silently
>ignored.
>
>Martin
>
>> 
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>> 
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
>-- 
>Computational Biology
>Fred Hutchinson Cancer Research Center
>1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
>Location: M1-B861
>Telephone: 206 667-2793
