[BioC] non-BiomaRt die
dtenenba at fhcrc.org
Sat Jan 25 20:52:53 CET 2014
----- Original Message -----
> From: "Waqasuddin Khan [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, waqasuddin at picb.ac.cn
> Sent: Friday, January 24, 2014 6:42:06 PM
> Subject: [BioC] non-BiomaRt die
> I tried to retrieve ensembl_gene_id and go_term for my arabidopsis
> thaliana gene from my gene_name list:
> > head(gene_name)
> 1 ANAC001
> 2 DCL1
> 3 MIR838A
> 4 AT1G01073
> 5 IQD18
> 6 AT1G01115
> > unimart = useMart("plants_mart_20",dataset="athaliana_eg_gene")
> > getBM(attributes=c("ensembl_gene_id",
> > "go_accession"),filters=c("ensembl_gene_id"),values=gene_name,mart=unimart)
> but got the following error. I could not figure it out. Is it an error
> on my side or from the BioMart server?
> -- output of sessionInfo():
> Error in getBM(attributes = c("ensembl_gene_id", "go_accession"),
> filters = c("ensembl_gene_id"), :
> Query ERROR: caught BioMart::Exception: non-BioMart die():
> not well-formed (invalid token) at line 1, column 21728, byte 21728
> at /usr/lib/perl5/XML/Parser.pm line 187
This error is happening on the server side. I know this because it is a Perl error (note the XML/Parser.pm path in the message) and there is no Perl on the client side.
The question is why it is happening. My guess is that there is an invalid item in your gene_name vector.
It could be a blank entry, something that is too long, or an entry containing invalid characters. Things to try:
tools::showNonASCII(gene_name) # non-ascii characters?
max(nchar(gene_name)) # length of the longest gene name
which(nchar(gene_name)==0) # which lines are blank?
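
If it helps, the three checks above can be rolled into one small function. This is only a sketch: check_ids() is a made-up helper, not part of biomaRt, and the example vector is invented apart from the first two names. It assumes gene_name might be a data frame (the row numbers printed by head() suggest so) and takes its first column in that case. The check for XML special characters is a guess based on the "not well-formed" message from the XML parser; I have not confirmed that such characters are what the server is choking on.

```r
# Hypothetical helper, not part of biomaRt: flags entries of an identifier
# vector that might upset the query sent to the BioMart server.
check_ids <- function(ids) {
  # If a data frame was passed, assume the identifiers are in its first column.
  if (is.data.frame(ids)) ids <- ids[[1]]
  ids <- as.character(ids)  # also turns factors into their labels

  # Same idea as tools::showNonASCII(): conversion to ASCII fails (NA)
  # or changes the string when non-ASCII bytes are present.
  ascii <- iconv(ids, "latin1", "ASCII")

  list(
    missing   = which(is.na(ids)),
    blank     = which(!is.na(ids) & !nzchar(trimws(ids))),
    non_ascii = which(!is.na(ids) & (is.na(ascii) | ascii != ids)),
    xml_chars = which(grepl("[<>&\"']", ids))  # assumption: may break the XML
  )
}

# Invented example data; only the first two names come from the post.
ids <- c("ANAC001", "DCL1", "", "AT1G\u00e901", "A&B", NA)
report <- check_ids(ids)
str(report)

# Drop every flagged entry before passing the values to getBM().
bad <- unique(unlist(report))
clean_ids <- if (length(bad)) ids[-bad] else ids
```

You could then pass clean_ids as the values argument of getBM() and see whether the error goes away. If it does, the entries listed in report are the ones to look at.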