[BioC] biomaRt error

Paul Leo p.leo at uq.edu.au
Thu Jan 14 00:45:56 CET 2010


Ditto.
Actually, 3 days ago some of my old queries started returning different
numbers of rows:

> length(unique(ann.all[ann.all[,"gene_biotype"]=="protein_coding" , "ensembl_gene_id"]))
[1] 19940

> length(unique(ann.all.entrez[ann.all.entrez[,"gene_biotype"]=="protein_coding" , "ensembl_gene_id"])) # 22836
[1] 22836

versus, for months beforehand and up until 4 days ago:

> length(unique(ann.all[ann.all[,"gene_biotype"]=="protein_coding" , "ensembl_gene_id"]))
[1] 22836

> length(unique(ann.all.entrez[ann.all.entrez[,"gene_biotype"]=="protein_coding" , "ensembl_gene_id"])) # 22836
[1] 22836

for these queries:
a.filter <- c("chromosome_name", "start", "end")
fil.vals <- list(chrom, low.cut, high.cut)

ann.all <- getBM(attributes = c("ensembl_gene_id", "external_gene_id", "chromosome_name",
                                "start_position", "end_position", "strand",
                                "gene_biotype", "mgi_symbol", "description"),
                 filters = a.filter, values = fil.vals, mart = mart)

ann.all.entrez <- getBM(attributes = c("ensembl_gene_id", "external_gene_id", "ensembl_transcript_id",
                                       "chromosome_name", "start_position", "end_position",
                                       "strand", "gene_biotype", "entrezgene",
                                       "mgi_symbol", "description"),
                        filters = a.filter, values = fil.vals, mart = mart)

I have no idea why the first query was affected and not the second. This did
coincide with a BioC update I did two days ago, but that may be a
coincidence.
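
One way to narrow this down would be to diff the protein-coding gene IDs returned by the two queries and look at what the first one is now missing. What follows is only a sketch: the two small data frames are made-up stand-ins for ann.all and ann.all.entrez (the IDs are invented, not real Ensembl IDs), coding.ids is a hypothetical helper, and only base R is used.

```r
# Toy stand-ins for the two getBM() results; the IDs below are invented.
ann.all <- data.frame(
  ensembl_gene_id = c("ENSMUSG01", "ENSMUSG02", "ENSMUSG03"),
  gene_biotype    = c("protein_coding", "protein_coding", "pseudogene"),
  stringsAsFactors = FALSE
)
ann.all.entrez <- data.frame(
  ensembl_gene_id = c("ENSMUSG01", "ENSMUSG02", "ENSMUSG02", "ENSMUSG04"),
  gene_biotype    = c("protein_coding", "protein_coding",
                      "protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: unique protein-coding gene IDs in a query result.
coding.ids <- function(ann) {
  unique(ann[ann[, "gene_biotype"] == "protein_coding", "ensembl_gene_id"])
}

ids.first  <- coding.ids(ann.all)
ids.second <- coding.ids(ann.all.entrez)

# IDs that the second query returns but the first one does not.
missing.from.first <- setdiff(ids.second, ids.first)
print(length(missing.from.first))
print(missing.from.first)
```

With the real results in place of the toy data, the rows of ann.all.entrez whose ensembl_gene_id is in missing.from.first could then be inspected to see whether the dropped genes have anything in common.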




Today I get the same error:
> listMarts()
Error: non-BioMart die(): 
not well-formed (invalid token) at line 1, column 11797, byte 11797 at /usr/lib/perl5/XML/Parser.pm line 187

  does not seem to be XML, nor to identify a file name




> sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] org.Mm.eg.db_2.3.6  RSQLite_0.8-1       DBI_0.2-5          
[4] AnnotationDbi_1.8.1 Biobase_2.6.1       biomaRt_2.2.0      

loaded via a namespace (and not attached):
[1] RCurl_1.3-1  tcltk_2.10.1 tools_2.10.1 XML_2.6-0  



-----Original Message-----
From: Iain Gallagher <iaingallagher at btopenworld.com>
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] biomaRt error
Date: Wed, 13 Jan 2010 19:00:14 +0000 (GMT)

Hello List

Can anyone shed any light on the following biomaRt error? Has something changed at Ensembl?

> library(biomaRt)
> 
> mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
Error: non-BioMart die(): 
not well-formed (invalid token) at line 1, column 11797, byte 11797 at /usr/lib/perl5/XML/Parser.pm line 187

  does not seem to be XML, nor to identify a file name
> sessionInfo()
R version 2.10.1 (2009-12-14) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] biomaRt_2.2.0

loaded via a namespace (and not attached):
[1] RCurl_1.3-1 XML_2.6-0  
> 

Thanks

Iain

