[Bioc-devel] Modification of seqnames in GRanges
Hervé Pagès
hpages at fhcrc.org
Tue Nov 1 20:52:42 CET 2011
Hi,
You can also do:
seqlevels(gr1) <- sub("chr", "", seqlevels(gr1))
This works on GRanges, GRangesList and GappedAlignments objects, but
not on TranscriptDb or BSgenome objects, where the seqlevels are
immutable (that could be changed, though, depending on how much need
there is for it).
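
As a minimal sketch of the one-liner above on a toy object (the object name `gr` is hypothetical; the two ranges and the seqlengths are copied from the question further down the thread, and the `^` anchor is an addition that restricts the match to a leading "chr"):

```r
library(GenomicRanges)

# Toy GRanges: two ranges on chr2L, plus two seqlevels (chr2LHet, chrM)
# that carry no ranges, mimicking the unused chrM level in the question.
gr <- GRanges(seqnames   = c("chr2L", "chr2L"),
              ranges     = IRanges(start = c(8117L, 11345L),
                                   end   = c(8192L, 11409L)),
              strand     = "+",
              seqlengths = c(chr2L = 23011544L, chr2LHet = 368872L,
                             chrM = 19517L))

# Rename the levels rather than the per-range values, so that used and
# unused levels are changed together.
seqlevels(gr) <- sub("^chr", "", seqlevels(gr))

seqlevels(gr)
as.character(seqnames(gr))
```

Because the replacement goes through seqlevels() rather than seqnames(), the levels and the per-range values cannot drift apart, which is what the error message in the original question complains about.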
Cheers,
H.
On 11-11-01 12:34 PM, Cook, Malcolm wrote:
> Thanks for the pointer.
>
> I think those are only in Bioconductor 2.9, which is to be released any day now.
>
> Cheers,
>
> ~Malcolm
>
>
>> -----Original Message-----
>> From: Valerie Obenchain [mailto:vobencha at fhcrc.org]
>> Sent: Friday, October 28, 2011 6:13 PM
>> To: Liu,Bin
>> Cc: Cook, Malcolm; 'bioc-devel at r-project.org'
>> Subject: Re: [Bioc-devel] Modification of seqnames in GRanges
>>
>> Bin,
>>
>> For more general use, you might be interested in:
>>
>> ?keepSeqlevels
>> ?renameSeqlevels
>>
>> in the GenomicRanges package.
>>
>> Valerie
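
A hedged sketch of how these two helpers might be used, assuming a current Bioconductor release, where they are documented in the GenomeInfoDb package and take a character vector as second argument (the signatures in the 2.9 release discussed here may differ). The object names `gr`, `gr_fb` and `gr_2L` are hypothetical; the ranges and seqlengths are taken from the original question.

```r
library(GenomicRanges)
library(GenomeInfoDb)   # documents renameSeqlevels() and keepSeqlevels() in current releases

gr <- GRanges(seqnames   = c("chr2L", "chr2L"),
              ranges     = IRanges(start = c(8117L, 11345L),
                                   end   = c(8192L, 11409L)),
              strand     = "+",
              seqlengths = c(chr2L = 23011544L, chr2LHet = 368872L,
                             chrM = 19517L))

# Named vector: names are the old seqlevels, values are the new ones.
new_names <- c(chr2L    = "2L",
               chr2LHet = "2LHet",
               chrM     = "dmel_mitochondrion_genome")
gr_fb <- renameSeqlevels(gr, new_names)

# Keep a single seqlevel. The dropped levels carry no ranges in this toy
# object, so no ranges need to be pruned.
gr_2L <- keepSeqlevels(gr_fb, "2L")

seqlevels(gr_fb)
seqlevels(gr_2L)
```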
>>
>>
>>
>> On 10/28/2011 02:26 PM, Liu,Bin wrote:
>>> Thanks a lot! It works!
>>>
>>> Bin
>>>
>>>
>>> -----Original Message-----
>>> From: Cook, Malcolm [mailto:MEC at stowers.org]
>>> Sent: Friday, October 28, 2011 1:57 PM
>>> To: Liu,Bin; 'bioc-devel at r-project.org'
>>> Subject: RE: Modification of seqnames in GRanges
>>>
>>> Hi Bin,
>>>
>>> I think I have you covered....
>>>
>>>
>>> UCSC2FlybaseGRanges <- function (GRanges) {
>>>     ### Rename the chromosomes in <GRanges> from UCSC conventions ('chr1',
>>>     ### etc) to comport with Flybase conventions ('1', etc) by stripping
>>>     ### leading 'chr' and translating 'M' as 'dmel_mitochondrion_genome'.
>>>     ## old way:?
>>>     ## levels(seqnames(BSgenome)) <- factor(Rle(sub('chr','',levels(seqnames(BSgenome)))))
>>>     ## levels(seqnames(BSgenome))[levels(seqnames(BSgenome))=="M"] <- "dmel_mitochondrion_genome"
>>>     ## '^' and '$' anchor the patterns so that only a leading 'chr' and an
>>>     ## exact 'M' are matched.
>>>     seqlevels(GRanges) <- sub('^chr', '', seqlevels(GRanges))
>>>     seqlevels(GRanges) <- sub('^M$', 'dmel_mitochondrion_genome', seqlevels(GRanges))
>>>     GRanges
>>> }
>>>
>>> Flybase2UCSC_ID <- function(ID)
>>>     gsub('^(.*)$', 'chr\\1', sub('dmel_mitochondrion_genome', 'M', ID))
>>>
>>> Flybase2UCSCGRanges <- function (GRanges) {
>>>     ### converse of UCSC2FlybaseGRanges()
>>>     seqlevels(GRanges) <- sub('dmel_mitochondrion_genome', 'M', seqlevels(GRanges))
>>>     seqlevels(GRanges) <- gsub('^(.*)$', 'chr\\1', seqlevels(GRanges))
>>>     GRanges
>>> }
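
One way to sanity-check a pair of converters like these is a round trip on the seqlevel names alone, without any GRanges object. The sketch below uses base R only; the helper names `ucsc_to_flybase_id` and `flybase_to_ucsc_id` are hypothetical and are not the functions from the message above, and the seqlevels are the ones listed in the original question.

```r
# Hypothetical helpers that work on plain character vectors.
ucsc_to_flybase_id <- function(id) {
    id <- sub("^chr", "", id)                      # strip a leading "chr" only
    id[id == "M"] <- "dmel_mitochondrion_genome"   # exact match on the mitochondrion
    id
}

flybase_to_ucsc_id <- function(id) {
    id[id == "dmel_mitochondrion_genome"] <- "M"
    paste0("chr", id)                              # prepend "chr" to every name
}

# Seqlevels as printed by levels(seqnames(gr1)) in the original question.
ucsc <- c("chr2L", "chr2LHet", "chr2R", "chr2RHet", "chr3L", "chr3LHet",
          "chr3R", "chr3RHet", "chr4", "chrU", "chrUextra", "chrX",
          "chrXHet", "chrYHet", "chrM")

flybase <- ucsc_to_flybase_id(ucsc)

# The round trip should give back the input unchanged.
stopifnot(identical(flybase_to_ucsc_id(flybase), ucsc))
```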
>>>
>>> Regards,
>>>
>>> ~Malcolm
>>>
>>>
>>>> -----Original Message-----
>>>> From: bioc-devel-bounces at r-project.org [mailto:bioc-devel-bounces at r-project.org] On Behalf Of Liu,Bin
>>>> Sent: Friday, October 28, 2011 11:56 AM
>>>> To: bioc-devel at r-project.org
>>>> Subject: [Bioc-devel] Modification of seqnames in GRanges
>>>>
>>>> Hi,
>>>> I have a question about how to modify the seqnames of the GRanges object
>>>> shown below. I would like to remove the "chr" prefix from the seqnames.
>>>> For example: "chr2L" -> "2L". I tried the following code and got an error:
>>>>
>>>>> seqnames(gr1) <- sub("chr", "", seqnames(gr1))
>>>> Error in `seqnames<-`(`*tmp*`, value = <S4 object of class "Rle">) :
>>>> levels of supplied 'seqnames' must be identical to 'seqlevels(x)'
>>>>
>>>>
>>>> The following check suggests that chrM is not in the seqnames but is in
>>>> the levels.
>>>>> levels(seqnames(gr1))
>>>> [1] "chr2L" "chr2LHet" "chr2R" "chr2RHet" "chr3L" "chr3LHet"
>>>> [7] "chr3R" "chr3RHet" "chr4" "chrU" "chrUextra" "chrX"
>>>> [13] "chrXHet" "chrYHet" "chrM"
>>>>
>>>>> unique(seqnames(gr1))
>>>> [1] chr2L chr2LHet chr2R chr2RHet chr3L chr3LHet chr3R
>>>> [8] chr3RHet chr4 chrU chrUextra chrX chrXHet chrYHet
>>>> 15 Levels: chr2L chr2LHet chr2R chr2RHet chr3L chr3LHet chr3R chr3RHet ... chrM
>>>>
>>>> Is there any way to solve the problem? Thanks for the help.
>>>>
>>>> Bin Liu
>>>>
>>>>
>>>>> gr1
>>>> GRanges with 48498 ranges and 0 elementMetadata values:
>>>> seqnames ranges strand
>>>> <Rle> <IRanges> <Rle>
>>>> [1] chr2L [ 8117, 8192] +
>>>> [2] chr2L [11345, 11409] +
>>>> [3] chr2L [11519, 11778] +
>>>> [4] chr2L [12222, 12285] +
>>>> [5] chr2L [12929, 13519] +
>>>> [6] chr2L [13626, 13682] +
>>>> [7] chr2L [14875, 14932] +
>>>> [8] chr2L [15712, 17052] +
>>>> [9] chr2L [17213, 18025] +
>>>> ... ... ... ...
>>>> [48490] chrYHet [191815, 191893] +
>>>> [48491] chrYHet [192066, 206119] +
>>>> [48492] chrYHet [229343, 232315] +
>>>> [48493] chrYHet [232496, 232758] +
>>>> [48494] chrYHet [233403, 233455] +
>>>> [48495] chrYHet [233595, 271120] +
>>>> [48496] chrYHet [280371, 291285] +
>>>> [48497] chrYHet [305380, 305531] +
>>>> [48498] chrYHet [306016, 306486] +
>>>> ---
>>>> seqlengths:
>>>> chr2L chr2LHet chr2R chr2RHet ... chrXHet chrYHet chrM
>>>> 23011544 368872 21146708 3288761 ... 204112 347038 19517
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319