[Bioc-sig-seq] extendReads() error in chipseq package

Thu Jun 4 02:14:28 CEST 2009

Hi Rebecca --

Rebecca Sun wrote:
> Hi all,
> 
> I am analyzing ChIP-seq data with the chipseq package. I get an error when
> I use extendReads() to extend the reads in an AlignedRead object to longer
> fragments.
> 
> ###################
>> library(ShortRead)
>> library(chipseq)
>> aln <- readAligned(".", "flowcell.bowtie", type = "Bowtie")
>> unique <- aln[!srduplicated(aln)]
>> ext <- extendReads(unique, seqLen = 200)
> Error in s1[[ipos]] : recursive indexing failed at level 2
> ###################

I don't have a fix for this bug, but...

> I found that extendReads() works when I use it on a single AlignedRead:
> 
>> ext <-extendReads(unique[1],seqLen=200)
>> ext
> $`gi|29823167|ref|NT_010966.13|Hs18_11123`
> IRanges instance:
>      start    end width
> [1] 307249 307448   200
> [2] 307089 307288   200
> 
> I also tried converting the aligned reads to GenomeData and then calling
> extendReads(). There was no error, but the return value is not an IRanges
> object:
>> unique_genome <- as(unique, "GenomeData")
>> ext1 <- extendReads(unique_genome, seqLen = 200)
>> ext1
> A GenomeData instance
> chromosomes(6): gi|29823167|ref|NT_010966.13|Hs18_11123 ...

The return value is a GenomeData container whose members are IRanges
objects, e.g., ext1[[1]] is an IRanges object. For example:

> example(readAligned)
> aln <- readAligned(sp, "s_2_export.txt", filter = filt)
> gd <- extendReads(as(aln, "GenomeData"), 200)
> gd
A GenomeData instance
chromosomes(5): chr1.fa chr2.fa chr3.fa chr4.fa chr5.fa

> gd[[1]]
IRanges instance:
         start       end width
[1]    3393025   3393224   200
[2]  171119303 171119502   200

[SNIP]

> gd[["chr1.fa"]]
IRanges instance:
         start       end width
[1]    3393025   3393224   200
[2]  171119303 171119502   200

[SNIP]
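
As a rough illustration of the access pattern above, here is a base-R sketch that uses a plain named list as a stand-in for the GenomeData container. The name gd_like and the data.frame layout are hypothetical and are not part of chipseq or IRanges. Only the start positions and the extension length of 200 are copied from the output above. It shows that indexing by position and by chromosome name returns the same member, and that each end is start + seqLen - 1.

```r
# Stand-in for a GenomeData-like container: a plain named list with one
# element per chromosome. This is NOT the real GenomeData class.
seqLen <- 200
starts <- c(3393025, 171119303)   # start positions copied from the output above

gd_like <- list(
  chr1.fa = data.frame(start = starts,
                       end   = starts + seqLen - 1,  # inclusive end coordinate
                       width = seqLen)
)

# Access by position and by name returns the same element.
by_index <- gd_like[[1]]
by_name  <- gd_like[["chr1.fa"]]
print(identical(by_index, by_name))

# Inspect the class of every member at once.
print(sapply(gd_like, class))
```

On a real GenomeData result, the analogous check would be to call class() on ext1[[1]], as in the session above.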

My sessionInfo() is

> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-02 r48703)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] chipseq_0.1.20     ShortRead_1.3.10   lattice_0.17-25    BSgenome_1.13.6
[5] Biostrings_2.13.12 IRanges_1.3.21

loaded via a namespace (and not attached):
[1] Biobase_2.5.2 grid_2.10.0   hwriter_1.1

What's yours?

Martin

> Any suggestion? Thanks.
> 
> Rebecca