[Bioc-devel] Modification of seqnames in GRanges

Hervé Pagès hpages at fhcrc.org
Tue Nov 1 20:52:42 CET 2011


Hi,

You can also do:

   seqlevels(gr1) <- sub("chr", "", seqlevels(gr1))

This works on GRanges, GRangesList and GappedAlignments objects, but
not on TranscriptDb or BSgenome objects, where the seqlevels are
immutable (that could be changed, though, depending on how much need
there is for it).
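
As a rough illustration of why renaming the levels works where assigning
new values does not, here is a base-R sketch. It is an analogy only: a
plain factor stands in for seqnames(gr1) and levels() stands in for
seqlevels(gr1), so it runs without any Bioconductor package. The object
names sn and ucsc are made up for the example, and the "^" anchor is an
addition that restricts the match to a leading "chr".

```r
# Base-R analogue only: 'sn' plays the role of seqnames(gr1).
# "chrM" is declared as a level but never used, as in the original question.
sn <- factor(c("chr2L", "chr2L", "chrYHet"),
             levels = c("chr2L", "chrYHet", "chrM"))

# Renaming the levels relabels every value in one step and keeps the
# unused level. The "^" anchor limits the match to a leading "chr".
levels(sn) <- sub("^chr", "", levels(sn))

print(levels(sn))        # levels are now "2L", "YHet", "M"
print(as.character(sn))  # values are now "2L", "2L", "YHet"

# Hypothetical round trip back to the UCSC style by prepending "chr".
ucsc <- sn
levels(ucsc) <- paste0("chr", levels(ucsc))
print(levels(ucsc))
```

The same idea carries over to the seqlevels() setter shown above: the
values and their levels are renamed together, so they never disagree.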

Cheers,
H.


On 11-11-01 12:34 PM, Cook, Malcolm wrote:
> Thanks for the pointer.
>
> I think those are only in Bioconductor 2.9, which is due to be released any day now.
>
> Cheers,
>
> ~Malcolm
>
>
>> -----Original Message-----
>> From: Valerie Obenchain [mailto:vobencha at fhcrc.org]
>> Sent: Friday, October 28, 2011 6:13 PM
>> To: Liu,Bin
>> Cc: Cook, Malcolm; 'bioc-devel at r-project.org'
>> Subject: Re: [Bioc-devel] Modification of seqnames in GRanges
>>
>> Liu and Bin,
>>
>> For more general use, you might be interested in:
>>
>> ?keepSeqlevels
>> ?renameSeqlevels
>>
>> in the GenomicRanges package.
>>
>> Valerie
>>
>>
>>
>> On 10/28/2011 02:26 PM, Liu,Bin wrote:
>>> Thanks a lot! It works!
>>>
>>> Bin
>>>
>>>
>>> -----Original Message-----
>>> From: Cook, Malcolm [mailto:MEC at stowers.org]
>>> Sent: Friday, October 28, 2011 1:57 PM
>>> To: Liu,Bin; 'bioc-devel at r-project.org'
>>> Subject: RE: Modification of seqnames in GRanges
>>>
>>> Hi Bin,
>>>
>>> I think I have you covered....
>>>
>>>
>>> UCSC2FlybaseGRanges <- function(GRanges) {
>>>     ### Rename the chromosomes in <GRanges> from UCSC conventions ('chr1',
>>>     ### etc) to comport with Flybase conventions ('1', etc) by stripping
>>>     ### leading 'chr' and translating 'M' as 'dmel_mitochondrion_genome'.
>>>     ## old way:?
>>>     ## levels(seqnames(BSgenome)) <- factor(Rle(sub('chr','',levels(seqnames(BSgenome)))))
>>>     ## levels(seqnames(BSgenome))[levels(seqnames(BSgenome))=="M"] <- "dmel_mitochondrion_genome"
>>>     ## Anchor both patterns so that only a leading 'chr' and an exact 'M' match.
>>>     seqlevels(GRanges) <- sub('^chr', '', seqlevels(GRanges))
>>>     seqlevels(GRanges) <- sub('^M$', 'dmel_mitochondrion_genome', seqlevels(GRanges))
>>>     GRanges
>>> }
>>>
>>> Flybase2UCSC_ID <- function(ID)
>>>     gsub('^(.*)$', 'chr\\1', sub('dmel_mitochondrion_genome', 'M', ID))
>>>
>>> Flybase2UCSCGRanges <- function(GRanges) {
>>>     ### converse of UCSC2FlybaseGRanges()
>>>     seqlevels(GRanges) <- sub('dmel_mitochondrion_genome', 'M', seqlevels(GRanges))
>>>     seqlevels(GRanges) <- gsub('^(.*)$', 'chr\\1', seqlevels(GRanges))
>>>     GRanges
>>> }
>>>
>>> Regards,
>>>
>>> ~Malcolm
>>>
>>>
>>>> -----Original Message-----
>>>> From: bioc-devel-bounces at r-project.org [mailto:bioc-devel-bounces at r-
>>>> project.org] On Behalf Of Liu,Bin
>>>> Sent: Friday, October 28, 2011 11:56 AM
>>>> To: bioc-devel at r-project.org
>>>> Subject: [Bioc-devel] Modification of seqnames in GRanges
>>>>
>>>> Hi,
>>>>       I have a question about how to modify the seqnames in the object below. I would
>>>> like to remove "chr" from the seqnames.
>>>> For example: "chr2L" -> "2L". I tried the following code and got an error:
>>>>
>>>>> seqnames(gr1) <- sub("chr", "", seqnames(gr1))
>>>> Error in `seqnames<-`(`*tmp*`, value = <S4 object of class "Rle">) :
>>>>     levels of supplied 'seqnames' must be identical to 'seqlevels(x)'
>>>>
>>>>
>>>>     With the following check, it seems chrM is not in the seqnames but is in
>>>> the levels.
>>>>> levels(seqnames(gr1))
>>>>    [1] "chr2L"     "chr2LHet"  "chr2R"     "chr2RHet"  "chr3L"     "chr3LHet"
>>>>    [7] "chr3R"     "chr3RHet"  "chr4"      "chrU"      "chrUextra" "chrX"
>>>> [13] "chrXHet"   "chrYHet"   "chrM"
>>>>
>>>>> unique(seqnames(gr1))
>>>>    [1] chr2L     chr2LHet  chr2R     chr2RHet  chr3L     chr3LHet  chr3R
>>>>    [8] chr3RHet  chr4      chrU      chrUextra chrX      chrXHet   chrYHet
>>>> 15 Levels: chr2L chr2LHet chr2R chr2RHet chr3L chr3LHet chr3R chr3RHet ...
>>>> chrM
>>>>
>>>>      Any way to solve the problem?   Thanks for the help.
>>>>
>>>> Bin Liu
>>>>
>>>>
>>>>> gr1
>>>> GRanges with 48498 ranges and 0 elementMetadata values:
>>>>             seqnames           ranges strand
>>>>                <Rle>          <IRanges>    <Rle>
>>>>         [1]    chr2L   [ 8117,  8192]      +
>>>>         [2]    chr2L   [11345, 11409]      +
>>>>         [3]    chr2L   [11519, 11778]      +
>>>>         [4]    chr2L   [12222, 12285]      +
>>>>         [5]    chr2L   [12929, 13519]      +
>>>>         [6]    chr2L   [13626, 13682]      +
>>>>         [7]    chr2L   [14875, 14932]      +
>>>>         [8]    chr2L   [15712, 17052]      +
>>>>         [9]    chr2L   [17213, 18025]      +
>>>>         ...      ...              ...    ...
>>>>     [48490]  chrYHet [191815, 191893]      +
>>>>     [48491]  chrYHet [192066, 206119]      +
>>>>     [48492]  chrYHet [229343, 232315]      +
>>>>     [48493]  chrYHet [232496, 232758]      +
>>>>     [48494]  chrYHet [233403, 233455]      +
>>>>     [48495]  chrYHet [233595, 271120]      +
>>>>     [48496]  chrYHet [280371, 291285]      +
>>>>     [48497]  chrYHet [305380, 305531]      +
>>>>     [48498]  chrYHet [306016, 306486]      +
>>>>     ---
>>>>     seqlengths:
>>>>          chr2L  chr2LHet     chr2R  chr2RHet ...   chrXHet   chrYHet      chrM
>>>>       23011544    368872  21146708   3288761 ...    204112    347038     19517
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel


-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


