[Bioc-devel] Modification of seqnames in GRanges

Liu,Bin BLiu1 at mdanderson.org
Fri Oct 28 23:26:01 CEST 2011


Thanks a lot! It works!

Bin


-----Original Message-----
From: Cook, Malcolm [mailto:MEC at stowers.org] 
Sent: Friday, October 28, 2011 1:57 PM
To: Liu,Bin; 'bioc-devel at r-project.org'
Subject: RE: Modification of seqnames in GRanges

Hi Bin,

I think I have you covered....


UCSC2FlybaseGRanges <- function (GRanges) {
### Rename the chromosomes in <GRanges> from UCSC conventions ('chr1',
### etc) to comport with Flybase conventions ('1', etc) by stripping
### leading 'chr' and translating 'M' as 'dmel_mitochondrion_genome'.
  ## old way:?
  ## levels(seqnames(BSgenome))<-factor(Rle(sub('chr','',levels(seqnames(BSgenome)))))
  ## levels(seqnames(BSgenome))[levels(seqnames(BSgenome))=="M"]<-"dmel_mitochondrion_genome"
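  ## Anchor both patterns so that only a leading 'chr' and the lone 'M'
  ## level are rewritten, never a match in the middle of a name.
  seqlevels(GRanges) <- sub('^chr','',seqlevels(GRanges))
  seqlevels(GRanges) <- sub('^M$','dmel_mitochondrion_genome',seqlevels(GRanges))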
  GRanges
}

Flybase2UCSC_ID <- function(ID) gsub('^(.*)$','chr\\1',sub('dmel_mitochondrion_genome','M',ID))

Flybase2UCSCGRanges <- function (GRanges) {
### converse of UCSC2FlybaseGRanges()
  seqlevels(GRanges) <- sub('dmel_mitochondrion_genome','M',seqlevels(GRanges))
  seqlevels(GRanges) <- gsub('^(.*)$','chr\\1',seqlevels(GRanges))
  GRanges
}
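
A quick way to check the renaming logic is to run it on a plain character vector before touching a GRanges. The following is only a sketch: the helper names ucsc_to_flybase() and flybase_to_ucsc() are made up for illustration and are not part of any package, and paste0() is swapped in for the gsub('^(.*)$', ...) idiom used above. The GRanges part runs only if GenomicRanges is installed, and the toy ranges in it are invented.

```r
# Hypothetical helpers (illustrative names, not from any package).
ucsc_to_flybase <- function(ids) {
  ids <- sub("^chr", "", ids)                    # strip only a leading 'chr'
  sub("^M$", "dmel_mitochondrion_genome", ids)   # rename the mitochondrion only
}

flybase_to_ucsc <- function(ids) {
  # Undo the mitochondrion rename first, then put the 'chr' prefix back.
  paste0("chr", sub("^dmel_mitochondrion_genome$", "M", ids))
}

ucsc <- c("chr2L", "chr2LHet", "chrX", "chrM")
fb <- ucsc_to_flybase(ucsc)
print(fb)

# The two conversions should be inverses of each other on these names.
stopifnot(identical(flybase_to_ucsc(fb), ucsc))

# Optional: apply the same mapping to the seqlevels of a toy GRanges.
# Assumes GenomicRanges is installed; the ranges are made up.
if (requireNamespace("GenomicRanges", quietly = TRUE)) {
  suppressPackageStartupMessages(library(GenomicRanges))
  gr <- GRanges(
    seqnames = factor(c("chr2L", "chrX"), levels = ucsc),
    ranges   = IRanges(start = c(8117, 100), end = c(8192, 200)),
    strand   = c("+", "+")
  )
  # Renaming goes through seqlevels(), not seqnames(), so unused levels
  # such as chrM are renamed along with the ones that are in use.
  seqlevels(gr) <- ucsc_to_flybase(seqlevels(gr))
  print(seqlevels(gr))
}
```

Going through seqlevels() is what avoids the "levels of supplied 'seqnames' must be identical to 'seqlevels(x)'" error, since the levels and the values are renamed together.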

Regards,

~Malcolm


> -----Original Message-----
> From: bioc-devel-bounces at r-project.org [mailto:bioc-devel-bounces at r-
> project.org] On Behalf Of Liu,Bin
> Sent: Friday, October 28, 2011 11:56 AM
> To: bioc-devel at r-project.org
> Subject: [Bioc-devel] Modification of seqnames in GRanges
> 
> Hi,
>     I have a question about how to modify the seqnames in the GRanges object
> shown below. I would like to remove the "chr" prefix from the seqnames.
> For example: "chr2L" -> "2L". I tried the following code and got an error:
> 
> > seqnames(gr1) <- sub("chr", "", seqnames(gr1))
> 
> Error in `seqnames<-`(`*tmp*`, value = <S4 object of class "Rle">) :
>   levels of supplied 'seqnames' must be identical to 'seqlevels(x)'
> 
> 
>   From the following check, it seems chrM is among the levels but does not
> occur in the seqnames themselves.
> > levels(seqnames(gr1))
>  [1] "chr2L"     "chr2LHet"  "chr2R"     "chr2RHet"  "chr3L"     "chr3LHet"
>  [7] "chr3R"     "chr3RHet"  "chr4"      "chrU"      "chrUextra" "chrX"
> [13] "chrXHet"   "chrYHet"   "chrM"
> 
> > unique(seqnames(gr1))
>  [1] chr2L     chr2LHet  chr2R     chr2RHet  chr3L     chr3LHet  chr3R
>  [8] chr3RHet  chr4      chrU      chrUextra chrX      chrXHet   chrYHet
> 15 Levels: chr2L chr2LHet chr2R chr2RHet chr3L chr3LHet chr3R chr3RHet ...
> chrM
> 
>    Any way to solve the problem?   Thanks for the help.
> 
> Bin Liu
> 
> 
> > gr1
> GRanges with 48498 ranges and 0 elementMetadata values:
>           seqnames           ranges strand
>              <Rle>        <IRanges>  <Rle>
>       [1]    chr2L   [ 8117,  8192]      +
>       [2]    chr2L   [11345, 11409]      +
>       [3]    chr2L   [11519, 11778]      +
>       [4]    chr2L   [12222, 12285]      +
>       [5]    chr2L   [12929, 13519]      +
>       [6]    chr2L   [13626, 13682]      +
>       [7]    chr2L   [14875, 14932]      +
>       [8]    chr2L   [15712, 17052]      +
>       [9]    chr2L   [17213, 18025]      +
>       ...      ...              ...    ...
>   [48490]  chrYHet [191815, 191893]      +
>   [48491]  chrYHet [192066, 206119]      +
>   [48492]  chrYHet [229343, 232315]      +
>   [48493]  chrYHet [232496, 232758]      +
>   [48494]  chrYHet [233403, 233455]      +
>   [48495]  chrYHet [233595, 271120]      +
>   [48496]  chrYHet [280371, 291285]      +
>   [48497]  chrYHet [305380, 305531]      +
>   [48498]  chrYHet [306016, 306486]      +
>   ---
>   seqlengths:
>        chr2L  chr2LHet     chr2R  chr2RHet ...   chrXHet   chrYHet      chrM
>     23011544    368872  21146708   3288761 ...    204112    347038     19517
> 
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel


