[Bioc-devel] Bioconductor 3.19 db0s, OrgDbs, and TxDbs now available

James W. MacDonald jm@cdon @end|ng |rom uw@edu
Thu Mar 28 15:12:24 CET 2024


As well as Vince's example below, the chrM transcripts themselves are present:

> subsetByOverlaps(transcripts(Homo.sapiens), GRanges("chrM:1-16569"))
'select()' returned 1:1 mapping between keys and columns
GRanges object with 37 ranges and 2 metadata columns:
       seqnames      ranges strand |          TXID            TXNAME
          <Rle>   <IRanges>  <Rle> | <IntegerList>   <CharacterList>
   [1]     chrM     577-647      + |        252799 ENST00000387314.1
   [2]     chrM    648-1601      + |        252800 ENST00000389680.2
   [3]     chrM   1602-1670      + |        252801 ENST00000387342.1
   [4]     chrM   1671-3229      + |        252802 ENST00000387347.2
   [5]     chrM   3230-3304      + |        252803 ENST00000386347.1
   ...      ...         ...    ... .           ...               ...
  [33]     chrM   5826-5891      - |        252831 ENST00000387409.1
  [34]     chrM   7446-7514      - |        252832 ENST00000387416.2
  [35]     chrM 14149-14673      - |        252833 ENST00000361681.2
  [36]     chrM 14674-14742      - |        252834 ENST00000387459.1
  [37]     chrM 15956-16023      - |        252835 ENST00000387461.2
  -------
  seqinfo: 711 sequences (1 circular) from hg38 genome

However, grouping the transcripts by gene returns nothing:

> subsetByOverlaps(transcriptsBy(Homo.sapiens), GRanges("chrM:1-16569"))
GRangesList object of length 0:
<0 elements>

And asking for the gene-level columns returns only NA:

> subsetByOverlaps(transcripts(Homo.sapiens, columns = c("GENEID","SYMBOL")), GRanges("chrM:1-16569"))
'select()' returned 1:1 mapping between keys and columns
GRanges object with 37 ranges and 2 metadata columns:
       seqnames      ranges strand |          GENEID          SYMBOL
          <Rle>   <IRanges>  <Rle> | <CharacterList> <CharacterList>
   [1]     chrM     577-647      + |            <NA>            <NA>
   [2]     chrM    648-1601      + |            <NA>            <NA>
   [3]     chrM   1602-1670      + |            <NA>            <NA>
   [4]     chrM   1671-3229      + |            <NA>            <NA>
   [5]     chrM   3230-3304      + |            <NA>            <NA>
   ...      ...         ...    ... .             ...             ...
  [33]     chrM   5826-5891      - |            <NA>            <NA>
  [34]     chrM   7446-7514      - |            <NA>            <NA>
  [35]     chrM 14149-14673      - |            <NA>            <NA>
  [36]     chrM 14674-14742      - |            <NA>            <NA>
  [37]     chrM 15956-16023      - |            <NA>            <NA>
  -------
  seqinfo: 711 sequences (1 circular) from hg38 genome

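Everything is mapped via the GENEID, and the chrM transcripts have none, which is why the gene-level queries above come back empty or NA. If you query the UCSC genome browser for hg38/knownGene, asking for gene name, known gene ID, and gene symbol, you will get the first and last but not the middle (the known gene ID).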
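
A toy base-R illustration of the pattern (not the actual TxDb internals; the table, its column names, and its values are made up for the example): grouping rows by a key silently drops every row whose key is NA. That is presumably the same reason transcripts() returns 37 chrM ranges while transcriptsBy() returns none.

```r
# Toy stand-in for a transcript table; all values are illustrative only.
tx <- data.frame(
  tx_name = c("tx1", "tx2", "tx3", "tx4"),
  chrom   = c("chrM", "chrM", "chr1", "chr1"),
  gene_id = c(NA, NA, "geneA", "geneA"),
  stringsAsFactors = FALSE
)

# Ungrouped view: every transcript is present, NA gene IDs included.
chrM_tx <- tx[tx$chrom == "chrM", ]
nrow(chrM_tx)

# Grouped view: split() drops rows whose grouping value is NA,
# so the chrM transcripts do not land in any group.
by_gene <- split(tx, tx$gene_id)
names(by_gene)

# Keep only the groups that contain a chrM transcript; none remain.
chrM_by_gene <- Filter(function(d) any(d$chrom == "chrM"), by_gene)
length(chrM_by_gene)
```

On the real objects, the equivalent check would be to count how many of the chrM transcripts carry a non-NA GENEID.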



-----Original Message-----
From: Bioc-devel <bioc-devel-bounces using r-project.org> On Behalf Of Vincent Carey
Sent: Thursday, March 28, 2024 10:00 AM
To: Tim Triche, Jr. <tim.triche using gmail.com>
Cc: bioc-devel using r-project.org
Subject: Re: [Bioc-devel] Bioconductor 3.19 db0s, OrgDbs, and TxDbs now available

winging it here tim

> select(Homo.sapiens, keys="ENSG00000198727", keytype="ENSEMBL",
columns=c("GENENAME", "GENEID", "CDSCHROM", "SYMBOL"))
'select()' returned 1:1 mapping between keys and columns
          ENSEMBL     GENENAME SYMBOL CDSCHROM GENEID
1 ENSG00000198727 cytochrome b   CYTB     <NA>   4519
> select(Homo.sapiens, keys= "MTCYBP1", keytype="SYMBOL",
columns=c("GENENAME", "GENEID", "CDSCHROM", "SYMBOL"))
'select()' returned 1:1 mapping between keys and columns
   SYMBOL            GENENAME CDSCHROM    GENEID
1 MTCYBP1 MT-CYB pseudogene 1     <NA> 100499418

relevant?

On Thu, Mar 28, 2024 at 9:17 AM Tim Triche, Jr. <tim.triche using gmail.com>
wrote:

> Hi Lori and fellow maintainers,
>
> I had a strange experience yesterday where I pulled down genes and 
> transcripts from Homo.sapiens, only to discover that all mitochondrial 
> encoded genes (MT-CYB, MT-CO2, etc) were missing.
>
> Is there an historical reason why this is so? Obviously these 
> transcripts are physiologically important, but beyond that, they’re 
> also used all the time in single cell sequencing to estimate viability.
>
> Best,
>
> --t
>
> > On Mar 28, 2024, at 8:47 AM, Kern, Lori via Bioc-devel <
> bioc-devel using r-project.org> wrote:
> >
> > Hello Bioconductor community,
> >
> > The newest db0, OrgDb, and TxDb annotation packages for the upcoming
> Bioconductor 3.19 release are up and available for download in the 
> devel version of Bioconductor.
> >
> > The deadline for submitting contributed annotation packages will be
> Wednesday, April 17th.
> >
> > The new db0 packages are:
> >
> > anopheles.db0_3.19.0.tar.gz
> > arabidopsis.db0_3.19.0.tar.gz
> > bovine.db0_3.19.0.tar.gz
> > canine.db0_3.19.0.tar.gz
> > chicken.db0_3.19.0.tar.gz
> > chimp.db0_3.19.0.tar.gz
> > ecoliK12.db0_3.19.0.tar.gz
> > ecoliSakai.db0_3.19.0.tar.gz
> > fly.db0_3.19.0.tar.gz
> > human.db0_3.19.0.tar.gz
> > malaria.db0_3.19.0.tar.gz
> > mouse.db0_3.19.0.tar.gz
> > pig.db0_3.19.0.tar.gz
> > rat.db0_3.19.0.tar.gz
> > rhesus.db0_3.19.0.tar.gz
> > worm.db0_3.19.0.tar.gz
> > xenopus.db0_3.19.0.tar.gz
> > yeast.db0_3.19.0.tar.gz
> > zebrafish.db0_3.19.0.tar.gz
> >
> > The new OrgDb packages are:
> >
> > GO.db_3.19.0.tar.gz
> > org.Ag.eg.db_3.19.0.tar.gz
> > org.At.tair.db_3.19.0.tar.gz
> > org.Bt.eg.db_3.19.0.tar.gz
> > org.Ce.eg.db_3.19.0.tar.gz
> > org.Cf.eg.db_3.19.0.tar.gz
> > org.Dm.eg.db_3.19.0.tar.gz
> > org.Dr.eg.db_3.19.0.tar.gz
> > org.EcK12.eg.db_3.19.0.tar.gz
> > org.EcSakai.eg.db_3.19.0.tar.gz
> > org.Gg.eg.db_3.19.0.tar.gz
> > org.Hs.eg.db_3.19.0.tar.gz
> > org.Mm.eg.db_3.19.0.tar.gz
> > org.Mmu.eg.db_3.19.0.tar.gz
> > org.Pt.eg.db_3.19.0.tar.gz
> > org.Rn.eg.db_3.19.0.tar.gz
> > org.Sc.eg.db_3.19.0.tar.gz
> > org.Ss.eg.db_3.19.0.tar.gz
> > org.Xl.eg.db_3.19.0.tar.gz
> > Orthology.eg.db_3.19.0.tar.gz
> > PFAM.db_3.19.0.tar.gz
> >
> > The new TxDb packages are:
> >
> > TxDb.Hsapiens.UCSC.hg38.refGene_3.19.0.tar.gz
> > TxDb.Mmusculus.UCSC.mm39.refGene_3.19.0.tar.gz
> >
> > Thank you
> >
> >
> > Lori Shepherd - Kern
> >
> > Bioconductor Core Team
> >
> > Roswell Park Comprehensive Cancer Center
> >
> > Department of Biostatistics & Bioinformatics
> >
> > Elm & Carlton Streets
> >
> > Buffalo, New York 14263
> >
> > _______________________________________________
> > Bioc-devel using r-project.org mailing list 
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
