[BioC] Order within a GRanges object
Cook, Malcolm
MEC at stowers.org
Tue Aug 20 15:05:35 CEST 2013
>Hello,
>
>I have some questions about the internal order of GRanges objects.
>
>1) Automatically there is an order depending on a) the seqnames
>(chromosomes) and b) the ranges.
No! There is no guarantee on the order.
> library(GenomicRanges)
> example(GRanges)
...
> longGR
GRanges with 30 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <integer>
a chr1 [1, 10] - | 1
b chr2 [2, 10] + | 2
c chr2 [3, 10] + | 3
d chr2 [4, 10] * | 4
e chr1 [5, 10] * | 5
... ... ... ... ... ...
chr2 [106, 115] - | 26
chr2 [107, 116] - | 27
chr3 [108, 117] - | 28
chr3 [109, 118] - | 29
chr3 [110, 119] - | 30
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500
> rev(longGR)
GRanges with 30 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <integer>
chr3 [110, 119] - | 30
chr3 [109, 118] - | 29
chr3 [108, 117] - | 28
chr2 [107, 116] - | 27
chr2 [106, 115] - | 26
... ... ... ... ... ...
e chr1 [5, 10] * | 5
d chr2 [4, 10] * | 4
c chr2 [3, 10] + | 3
b chr2 [2, 10] + | 2
a chr1 [1, 10] - | 1
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500
>
>
>2) The seqnames are always sorted in ascii order.
No! But they _can_ be, if you sort explicitly:
> sort(longGR)
GRanges with 30 ranges and 1 metadata column:
seqnames ranges strand | score
<Rle> <IRanges> <Rle> | <integer>
f chr1 [6, 10] + | 6
chr1 [1, 5] - | 101
a chr1 [1, 10] - | 1
chr1 [2, 6] - | 102
chr1 [3, 7] - | 103
... ... ... ... ... ...
j chr3 [ 10, 10] - | 10
chr3 [ 10, 14] - | 110
chr3 [108, 117] - | 28
chr3 [109, 118] - | 29
chr3 [110, 119] - | 30
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500
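One caveat worth spelling out: as far as I know, sort() on a GRanges orders by the seqlevels of the object, not by the ASCII order of the names. A minimal sketch, assuming a made-up object `gr` whose seqlevels are set by hand (the names and coordinates are illustrative only and not taken from longGR):

```r
library(GenomicRanges)

# Hypothetical toy object, entered in no particular order.
gr <- GRanges(
  seqnames = c("chr2", "chr10", "chr1", "chr2"),
  ranges   = IRanges(start = c(50, 5, 20, 10), width = 5),
  strand   = "+"
)

# Put the seqlevels in "natural" chromosome order; in ASCII order
# "chr10" would come before "chr2".
seqlevels(gr) <- c("chr1", "chr2", "chr10")

# The object itself keeps the order it was built in.
as.character(seqnames(gr))

# sort() follows the seqlevels, then orders the ranges within each one.
gr_sorted <- sort(gr)
as.character(seqnames(gr_sorted))
start(gr_sorted)
```

So if a particular chromosome order is wanted, setting the seqlevels before sorting is one way to get it.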
~ Malcolm Cook
>
>3) After
> df <- as.data.frame
> m <- regexpr ("\\d+", df$seqnames, perl=TRUE)
> df$Chromosome <- regmatches (df$seqnames, m)
> df$Chromosome <- as.integer (as.character (df$Chromosome))
> df <- df [order(df$Chromosome),]
> only the order of the chromosomes is changed. The order of the ranges
>(now df$start and df$end) is still the same.
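On point 3, base R's order() leaves ties in their original relative order, so rows within one chromosome stay in whatever order they had before, which is not necessarily sorted by start. A minimal sketch in base R, assuming a made-up data frame `df` standing in for the as.data.frame() result (all values are illustrative):

```r
# Hypothetical stand-in for the data frame built from a GRanges object.
df <- data.frame(
  seqnames = c("chr2", "chr10", "chr1", "chr2", "chr1"),
  start    = c(50L, 5L, 20L, 10L, 3L),
  end      = c(60L, 15L, 30L, 20L, 13L),
  stringsAsFactors = FALSE
)

# Extract the first run of digits from each name. This assumes every name
# contains digits; names such as "chrX" would be dropped by regmatches()
# and the assignment below would then fail on a length mismatch.
m <- regexpr("[0-9]+", df$seqnames)
df$Chromosome <- as.integer(regmatches(df$seqnames, m))

# Ordering by chromosome only: rows within a chromosome keep their input order.
by_chrom <- df[order(df$Chromosome), ]
by_chrom$start        # 20 3 50 10 5

# Adding start as a second key also orders the ranges within each chromosome.
by_chrom_start <- df[order(df$Chromosome, df$start), ]
by_chrom_start$start  # 3 20 10 50 5
```

If the ranges should be ordered as well, passing start (and end, if needed) as extra keys to order() does that in one step.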
>
>Are my assumptions true?
>
>Thanks Hermann
>