[BioC] biomaRt issues
Leonardo Collado Torres
lcollado at lcg.unam.mx
Tue Sep 8 07:43:08 CEST 2009
Hello BioC users :)
I'm having some trouble using biomaRt with the UniProt database.
#I can execute the following code and everything works fine (with ENSEMBL):
library(biomaRt)
bsub <- useMart( "bacterial_mart_54", dataset = "bac_6_gene")
res <- getBM( attributes=c("start_position", "end_position", "strand",
"status"), filters= c("start", "end"), values = list("1", "100000"),
mart = bsub)
library(lattice)
print(xyplot(end_position~start_position | status, group=strand,
data=res, auto.key=TRUE))
# But if I want to retrieve the EC numbers and organism info for the
# viral proteins on UniProt, this should work:
# (I first ran it through http://www.ebi.ac.uk/uniprot/biomart/martview
# and it worked)
library(biomaRt)
uni <- useMart("uniprot_mart", dataset="UNIPROT")
virus <- getBM(attributes = c("ec_number","organism"), filters =
"superregnum_name", values = "Viruses", mart = uni)
dim(virus)
[1] 0 2
# But the virus object has 0 rows. The same happens if I use
# checkFilters = FALSE.
# Using the website app, I do get information back.
# If I request only the "organism" attribute, then I do get some information.
virus2 <- getBM(attributes = c("organism"), filters =
"superregnum_name", values = "Viruses", mart = uni)
dim(virus2)
[1] 5063 1
# However, I re-ran the "virus2" query a few minutes later and got a
# different result (I checked around 4 times and got the same numbers):
virus2 <- getBM(attributes = c("organism"), filters =
"superregnum_name", values = "Viruses", mart=uni)
dim(virus2)
[1] 158 1
# Then I ran it once more after typing the above lines in this mail, and
# got the original result again:
virus2 <- getBM(attributes = c("organism"), filters =
"superregnum_name", values = "Viruses", mart=uni)
dim(virus2)
[1] 5063 1
# I'm pretty sure that I didn't lose my internet connection in the
# meantime, so I don't really know what is causing this.
# I then tried the same lines on a different machine (different network
# too). At first I got the same 5063 rows, and then I got:
virus2 <- getBM(attributes = c("organism"), filters =
"superregnum_name", values = "Viruses", mart=uni)
dim(virus2)
[1] 8431 1
# Then 5063 again, etc.
In the end, 5063 seems to pop up more frequently, but is it the actual
result? Is there a way to make sure I'm not missing information without
calling getBM multiple times to check that there are no unexpected results?
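To make the question concrete, here is a rough sketch of the kind of check I'm hoping to avoid. It still calls the query repeatedly, so it is a workaround and not a real answer. The helper name check_query_stability and its arguments are made up for this mail (they are not part of biomaRt), and I'm assuming that each call returns a data frame, as getBM does.

```r
# Hypothetical helper (not part of biomaRt): run the same query several
# times, record how many rows each attempt returned, and pool the rows.
check_query_stability <- function(query_fun, n_tries = 5) {
  results <- lapply(seq_len(n_tries), function(i) query_fun())
  n_rows <- vapply(results, nrow, integer(1))
  list(
    n_rows   = n_rows,                           # row count per attempt
    stable   = length(unique(n_rows)) == 1L,     # TRUE if all counts agree
    combined = unique(do.call(rbind, results))   # distinct rows seen overall
  )
}

# Intended use with the query from this mail (needs network access):
# chk <- check_query_stability(function() {
#   getBM(attributes = "organism", filters = "superregnum_name",
#         values = "Viruses", mart = uni)
# })
# chk$n_rows
# chk$stable
```

Even if all attempts agree, that only shows the answers are consistent, not that they are complete, which is why I'd prefer a proper way to verify the result.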
I had assigned some homework exercises using biomaRt to access Uniprot,
but now I'm confused myself about what's going on :P
Any tips will be great :) Thanks!
Leonardo
# First comp session info
sessionInfo()
R version 2.10.0 Under development (unstable) (2009-07-21 r48968)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lattice_0.17-25 biomaRt_2.1.0
loaded via a namespace (and not attached):
[1] grid_2.10.0 RCurl_0.98-1 XML_2.5-1
# Second comp session info
sessionInfo()
R version 2.10.0 Under development (unstable) (2009-08-10 r49131)
sparc-sun-solaris2.9
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.1.0
loaded via a namespace (and not attached):
[1] RCurl_1.2-0 XML_2.6-0
--
Leonardo Collado Torres, Bachelor in Genomic Sciences
Professor at LCG and member of Dr. Enrique Morett's lab
UNAM Campus Cuernavaca, Mexico
Homepage: http://www.lcg.unam.mx/~lcollado/
Phone: [52] (777) 313-28-05