[BioC] Is retrieving exon sequences with biomaRt a random process?
Wolfgang Huber
huber at ebi.ac.uk
Mon Apr 13 11:30:14 CEST 2009
Dear Jürg
thank you for the feedback! Can you send us a reproducible example?
That would help us figure out what is going on. In the example you
posted, what is the object "ensembl", and how did you generate it?
I tried the following example, which is as close to yours as I could
make it. I could not reproduce your problem: I got consistent
(i.e. non-random) results, as shown below:
ensembl = useMart("ensembl",dataset="hsapiens_gene_ensembl")
res = lapply(sequence(50), function(i)
R 2.8.1
> res
.... (46 more times NULL)
> sessionInfo()
R version 2.8.1 (2008-12-22)
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_1.16.0
loaded via a namespace (and not attached):
[1] RCurl_0.92-0 XML_1.99-0
Today's R + bioC devel
> res
[1] gene_exon entrezgene
<0 rows> (or 0-length row.names)
[1] gene_exon entrezgene
<0 rows> (or 0-length row.names)
[1] gene_exon entrezgene
<0 rows> (or 0-length row.names)
.... (46 times the same)
[1] gene_exon entrezgene
<0 rows> (or 0-length row.names)
Also, when using a different Entrez Gene ID, I get a non-trivial result,
e.g. with
> g=
> str(g)
'data.frame': 23 obs. of 2 variables:
$ gene_exon : chr
__truncated__ ...
$ entrezgene: int 1499 1499 1499 1499 1499 1499 1499 1499 1499 1499 ...
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-04-12 r48319)
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] biomaRt_1.99.9
loaded via a namespace (and not attached):
[1] RCurl_0.94-1 XML_2.3-0 tools_2.10.0
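To make the comparison between runs easier to report, the repeated query can be tabulated rather than inspected by eye. The following is only a minimal sketch: classify_result() and tabulate_repeats() are hypothetical helpers, not part of biomaRt, and the stub query is an offline stand-in for a real call. It distinguishes a NULL result (as seen with biomaRt 1.16.0 above) from a zero-row data frame (as seen with the devel version) and from a result that actually contains rows.

```r
# Hypothetical helper (not part of biomaRt): classify one query result.
classify_result <- function(x) {
  if (is.null(x)) {
    "NULL"
  } else if (is.data.frame(x) && nrow(x) == 0L) {
    "empty"
  } else {
    "non-empty"
  }
}

# Hypothetical helper: call `fetch` n times and count each kind of result.
tabulate_repeats <- function(fetch, n = 50L) {
  kinds <- vapply(seq_len(n), function(i) classify_result(fetch()),
                  character(1))
  table(factor(kinds, levels = c("NULL", "empty", "non-empty")))
}

# Offline stand-in for a query that always returns a zero-row data frame.
stub <- function() {
  data.frame(gene_exon = character(0), entrezgene = integer(0))
}
print(tabulate_repeats(stub, n = 5L))

# With a live connection, the query from the original post could be
# passed in the same way (assumes the "ensembl" mart is reachable):
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# tabulate_repeats(function()
#   getSequence(id = 21419, type = "entrezgene",
#               seqType = "gene_exon", mart = ensembl))
```

A mixture of counts across the three categories for the same ID would indicate the kind of inconsistency described in the original post; a single non-zero category would indicate consistent behaviour.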
Straubhaar, Juerg wrote:
> I am using the following code to retrieve the exon sequences of gene Tcfap2b with GeneID:21419. There are 8 exons for this gene.
> for (i in sequence(50)) {
> + x <- getSequence(id=21419,type="entrezgene",seqType="gene_exon",mart=ensembl)
> + if (is.null(x)) print('NULL result')
> + if (!is.null(x)) print("Correct result")
> + }
> This gives 44 NULL results and 6 correct results. 'correct' means getSequence() outputs the sequences of the exons.
>> sessionInfo()
> R version 2.8.1 (2008-12-22)
> x86_64-pc-linux-gnu
> locale:
> C
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] biomaRt_1.16.0
> loaded via a namespace (and not attached):
> [1] RCurl_0.94-0 XML_1.99-0 tools_2.8.1
> Thank you,
> Juerg Straubhaar, Umass Med School
Wolfgang Huber EMBL-EBI http://www.ebi.ac.uk/huber
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: biomaRt-Straubhaar.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090413/8db958c1/attachment.txt>