[BioC] makeTranscriptDbFromBiomart - yeast - 2micron plasmid missing
Hervé Pagès
hpages at fhcrc.org
Wed Mar 6 09:45:59 CET 2013
Hi Stefanie,
It's not a good idea to start a new thread by replying to an old one. This
confuses most thread-aware email clients: they place the new question
deep inside the old thread, and as a result most people won't see it.
On 03/05/2013 01:34 AM, Stefanie Tauber wrote:
> Dear List,
>
> I am creating a TranscriptDatabase as follows:
>
> library(GenomicFeatures)
> myDB <- makeTranscriptDbFromBiomart(biomart = "ensembl", dataset = "scerevisiae_gene_ensembl", circ_seqs = c(DEFAULT_CIRC_SEQS, "Mito"))
> myDBx <- cdsBy(myDB, by = "tx", use.names = TRUE)
>
> everything fine so far,
> I am just missing 4 ORFs which are present on the 2 micron plasmid. (R0010W, R0020C, R0030W, R0040C)
There doesn't seem to be any 2-micron plasmid in the Yeast reference
genome currently in use by Ensembl:
> seqlengths(myDB)
      I      II     III      IV       V      VI     VII    VIII      IX       X
 230218  813184  316620 1531933  576874  270161 1090940  562643  439888  745751
     XI     XII    XIII     XIV      XV     XVI    Mito
 666816 1078177  924431  784333 1091291  948066   85779
No 2-micron plasmid either in UCSC sacCer3:
> library(BSgenome.Scerevisiae.UCSC.sacCer3)
> Scerevisiae
Yeast genome
|
| organism: Saccharomyces cerevisiae (Yeast)
| provider: UCSC
| provider version: sacCer3
| release date: April 2011
| release name: SGD April 2011 sequence
|
| sequences (see '?seqnames'):
| chrI chrII chrIII chrIV chrV chrVI chrVII chrVIII
| chrIX chrX chrXI chrXII chrXIII chrXIV chrXV chrXVI
| chrM
|
| (use the '$' or '[[' operator to access a given sequence)
> seqlengths(Scerevisiae)
   chrI   chrII  chrIII   chrIV    chrV   chrVI  chrVII chrVIII   chrIX    chrX
 230218  813184  316620 1531933  576874  270161 1090940  562643  439888  745751
  chrXI  chrXII chrXIII  chrXIV   chrXV  chrXVI    chrM
 666816 1078177  924431  784333 1091291  948066   85779
The two yeast genomes above (Ensembl and UCSC) appear to be the same
assembly, although identical chromosome lengths do not guarantee that
the sequences themselves are identical.
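For what it's worth, here is a minimal sketch of that length comparison in
base R, with the values copied from the two seqlengths() outputs above. The
renaming rule (prefix "chr", and "Mito" becomes "chrM") is my assumption
about how the two naming schemes map onto each other, and to_ucsc() is just
a helper made up for this example. Matching lengths still say nothing about
the sequences themselves.

```r
# Chromosome lengths copied from the seqlengths() outputs shown above.
ensembl <- c(I = 230218, II = 813184, III = 316620, IV = 1531933,
             V = 576874, VI = 270161, VII = 1090940, VIII = 562643,
             IX = 439888, X = 745751, XI = 666816, XII = 1078177,
             XIII = 924431, XIV = 784333, XV = 1091291, XVI = 948066,
             Mito = 85779)
ucsc <- c(chrI = 230218, chrII = 813184, chrIII = 316620, chrIV = 1531933,
          chrV = 576874, chrVI = 270161, chrVII = 1090940, chrVIII = 562643,
          chrIX = 439888, chrX = 745751, chrXI = 666816, chrXII = 1078177,
          chrXIII = 924431, chrXIV = 784333, chrXV = 1091291, chrXVI = 948066,
          chrM = 85779)

# Hypothetical helper (assumption): map Ensembl-style names to UCSC-style
# names by prefixing "chr" and renaming the mitochondrion to "chrM".
to_ucsc <- function(x) ifelse(x == "Mito", "chrM", paste0("chr", x))
names(ensembl) <- to_ucsc(names(ensembl))

# Same set of sequence names after renaming?
same_names <- setequal(names(ensembl), names(ucsc))

# Same length for every sequence, matched by name rather than by position?
same_lengths <- all(ensembl[names(ucsc)] == ucsc)

# Is any sequence named like a 2-micron plasmid? (the pattern is a guess)
has_plasmid <- any(grepl("2.?micron", names(ucsc), ignore.case = TRUE))
```

Both same_names and same_lengths being TRUE only shows that the two
providers agree on names and lengths. Confirming that the sequences are
identical would require comparing the sequences themselves.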
>
>
> If one puts here http://www.yeastgenome.org/cgi-bin/seqTools "R0010W",
> one finds the following info:
> FLP1/R0010W, ORF, on 2-micron plasmid from coordinates 252 to 1523.
>
>
> While the annotation from Ensembl is imported from SGD, no ORFs are listed for the 2-micron plasmid, and therefore
> also not accessible via makeTranscriptDbFromBiomart.
>
> Any hints what I am getting wrong?
I'm not sure why the 2-micron plasmid was dropped by Ensembl (and UCSC)
but that sounds more like a question for them.
H.
>
> Best,
> Stefanie
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319