[BioC] Merging RangedData objects with names on the IRanges part
Patrick Aboyoun
paboyoun at fhcrc.org
Thu Aug 20 19:13:22 CEST 2009
Ulrike,
First of all, I'm glad IRanges is useful for you.
Second, thanks for finding a bug in the rbind method for RangedData
objects. Due to a developer oversight, duplicate names in the ranges
were being handled differently from duplicate rownames in the values.
This has been corrected in a recent svn check-in to IRanges in the
BioC 2.5 code line. You can get the updated IRanges package (version
1.3.58) either through svn access now or, in 24-48 hours, from
bioconductor.org via biocLite once the updated package has been
posted there.
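Once the update is installed, a quick way to confirm that the fixed
version is the one R actually picks up is to compare version strings.
This is only a sketch: it assumes IRanges is already installed, that any
version at or after 1.3.58 carries the fix, and the variable names
(installed, fixed) are illustrative.

```r
# Sketch: check whether the installed IRanges is at least the fixed version.
# Assumes IRanges is installed; "1.3.58" is the version named above.
installed <- packageDescription("IRanges")$Version
fixed <- "1.3.58"

# compareVersion() returns -1, 0 or 1 for older, equal or newer.
if (compareVersion(installed, fixed) >= 0) {
  message("IRanges ", installed, " should include the rbind fix")
} else {
  message("IRanges ", installed, " predates the fix; update to ", fixed,
          " or later")
}
```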
> suppressMessages(library(IRanges))
> t1 <- RangedData(IRanges(start=c(7828367, 7828552,4121953),
end=c(7828402, 7828587, 4121988)), space=c("Chr1", "Chr1", "Chr3"),
mapq=c(1,2,1),flag=c(3,4,5))
> rbind(t1, t1)
RangedData: 6 ranges by 2 columns on 2 sequences
colnames(2): mapq flag
names(2): Chr1 Chr3
> t2 <- RangedData(IRanges(start=c(7828367, 7828552,4121953),
end=c(7828402, 7828587, 4121988), names=c("a", "b", "c")),
space=c("Chr1", "Chr1", "Chr3"), mapq=c(1,2,1),flag=c(3,4,5))
> rbind(t2, t2)
RangedData: 6 ranges by 2 columns on 2 sequences
colnames(2): mapq flag
names(2): Chr1 Chr3
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-08-05 r49073)
i386-apple-darwin9.7.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] IRanges_1.3.58
Patrick
Ulrike Goebel wrote:
> Dear list,
>
> I would like to do the following:
> Read a BWA output file (SAM format) in "chunks" and incrementally
> build a RangedData object from
> the chunks (using 'rbind'). Ultimately, this should be used to get the
> number of reads per annotated transcript/region, but that is not the
> question here.
>
> Assume as an example:
> t1 <- RangedData(IRanges(start=c(7828367, 7828552,4121953),
> end=c(7828402, 7828587, 4121988)), space=c("Chr1", "Chr1", "Chr3"),
> mapq=c(1,2,1),flag=c(3,4,5))
>
> I can merge two copies of this by 'rbind(t1,t1)'.
>
> But:
> t2 <- RangedData(IRanges(start=c(7828367, 7828552,4121953),
> end=c(7828402, 7828587, 4121988), names=c("a", "b", "c")),
> space=c("Chr1", "Chr1", "Chr3"), mapq=c(1,2,1),flag=c(3,4,5))
> (Here, I would like to keep the read names along with their positions
> in the IRanges object).
>
> > rbind(t2,t2)
> Error in validObject(.Object) :
> invalid class "RangedData" object: the names of the ranges must equal
> the rownames
>
> Am I doing something completely wrong here? Or is rbind confusing two
> different meanings of 'names'?
>
>
> BTW, I really like IRanges !
>
> Ulrike
> > sessionInfo()
> R version 2.10.0 Under development (unstable) (2009-08-01 r49053)
> x86_64-unknown-linux-gnu
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] grid stats graphics grDevices utils datasets methods
> [8] base
>
> other attached packages:
> [1] ChIPR_1.1.3 MASS_7.3-0 spatstat_1.16-1
> [4] deldir_0.0-8 gpclib_1.4-4 mgcv_1.5-5
> [7] convert_1.21.1 marray_1.23.0 matchprobes_1.17.0
> [10] AnnotationDbi_1.7.11 Biostrings_2.13.29 TeachingDemos_2.4
> [13] Ringo_1.9.8 Matrix_0.999375-30 lattice_0.17-25
> [16] limma_2.19.2 RColorBrewer_1.0-2 Biobase_2.5.5
> [19] IRanges_1.3.56
>
> loaded via a namespace (and not attached):
> [1] affy_1.23.4 affyio_1.13.3 annotate_1.23.1
> [4] DBI_0.2-4 genefilter_1.25.7 nlme_3.1-92
> [7] preprocessCore_1.7.4 RSQLite_0.7-1 splines_2.10.0
> [10] survival_2.35-4 tools_2.10.0 xtable_1.5-5
>