[BioC] Problem with getBM function in biomaRt package
Steffen Durinck
durincks at mail.nih.gov
Wed Jul 5 16:31:43 CEST 2006
Hi Luo,
If you request a list as output, biomaRt will in your case send 8000
separate queries to the server. This is not well suited to large query
vectors. Have you tried using biomaRt with the default output (a
data.frame)?
genelist = getBM(attributes = c("hgnc_symbol", "description"),
                 filter = "entrezgene", values = igenes, mart = mart)
You should have no problems querying > 8000 ids when using the default
output.
If you do need list output and have many ids, then I would recommend
using biomaRt's RMySQL mode.
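Another option, if a list is what you want in the end, might be to run the single data.frame query and build the list locally in R. Below is a minimal sketch of that idea, not something biomaRt provides: bm_to_list is a hypothetical helper, the data.frame is a small mock standing in for a getBM result, and it assumes the result carries an id column (here called entrezgene) next to the requested attributes, which may require asking for the id as an attribute as well.

```r
# Mock of what a single data.frame query might return. The column names
# and rows are assumptions for illustration, not real getBM output.
res <- data.frame(
  entrezgene  = c("1", "2", "2", "9"),
  hgnc_symbol = c("A1BG", "A2M", "A2M", "NAT1"),
  description = c("alpha-1-B glycoprotein", "alpha-2-macroglobulin",
                  "alpha-2-macroglobulin", "N-acetyltransferase 1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: group one value column by the id column, locally.
# Ids listed in 'ids' but absent from the result become empty vectors.
bm_to_list <- function(df, id_col, value_col, ids = unique(df[[id_col]])) {
  groups <- split(df[[value_col]], factor(df[[id_col]], levels = ids))
  lapply(groups, unique)  # drop duplicated rows within each id
}

symbols <- bm_to_list(res, "entrezgene", "hgnc_symbol")
```

Passing the full query vector as ids (for example ids = igenes) keeps one list element per requested id, including those with no match, while the server still sees only one request.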
Best,
Steffen
Luo Weijun wrote:
> Hello all,
> I am trying to get gene symbols and full gene names
> (description) for a long list of (>=8000) genes. I use
> getBM function in biomaRt package. And the code is
> pretty much the same as Jim's 'HowTo: get pretty
> HTML output for my gene list' vignette. Everything
> works fine when I use a much shorter list (100 genes),
> i.e. igenes= hs95av2Entrezg7[1:100] in the following
> code. But when igenes= hs95av2Entrezg7 (full gene
> list), getBM doesn't work, and returns an error
> message.
>
>
>> library(biomaRt)
>>
> Loading required package: XML
> Loading required package: RCurl
>
>> mart <- useMart("ensembl", "hsapiens_gene_ensembl")
>>
> Checking attributes and filters ... ok
>
> load('/Users/luow/project/microarraydata/annotation/hs95av2Entrezg7.Rdata')
>
>> igenes=hs95av2Entrezg7
>>
> <escription"), filter = "entrezgene",values = igenes,
> mart = mart, output = "list",na.value ='')
>
> ##(note here my original input is:
> genelist=getBM(attributes =
> c("hgnc_symbol","description"), filter =
> "entrezgene",values = igenes, mart = mart, output =
> "list",na.value ='')
> ##and this long line is truncated in the terminal
> screen somehow)
> Error in postForm(paste(mart@host, "?", sep = ""),
> query = xmlQuery) :
> couldn't connect to host
>
>
> Since Jim also suggests that RMySQL is much faster
> than RCurl, I also tried to install RMySQL package,
> but the error message says there is no such package,
> even though I did see RMySQL is there in the
> contributed package list in all mirror sites of CRAN I
> tried. Not sure what is the problem.
>
>
>> install.packages('RMySQL', repos =
>>
> "http://www.biometrics.mtu.edu/CRAN/")
> Warning in download.packages(pkgs, destdir = tmpd,
> available = available, :
> no package 'RMySQL' at the repositories
>
>
> Here is my session info
>
>> sessionInfo()
>>
> Version 2.3.1 (2006-06-01)
> powerpc-apple-darwin8.6.0
>
> attached base packages:
> [1] "methods" "stats" "graphics" "grDevices"
> "utils" "datasets"
> [7] "base"
>
> other attached packages:
> biomaRt RCurl XML
> "1.6.0" "0.6-2" "0.99-7"
>
>
> I actually can't even do sessionInfo after the getBM
> line got broken.
>
>> sessionInfo()
>>
> Error in gzfile(file, "rb") : unable to open
> connection
> In addition: Warning messages:
> 1: list.files:
> '/Library/Frameworks/R.framework/Resources/library' is
> not a readable directory
> 2: cannot open compressed file
> '/Library/Frameworks/R.framework/Resources/library/biomaRt/Meta/package.rds'
>
>
>
> Thank you so much for your kind help!
> Weijun
>