[BioC] biomaRt error

Iain Gallagher iaingallagher at btopenworld.com
Thu Jan 14 10:51:01 CET 2010


Thank you Rhoda.

Very useful.

Cheers

Iain

--- On Thu, 14/1/10, Rhoda Kinsella <rhoda at ebi.ac.uk> wrote:

> From: Rhoda Kinsella <rhoda at ebi.ac.uk>
> Subject: Re: [BioC] biomaRt error
> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> Cc: bioconductor at stat.math.ethz.ch
> Date: Thursday, 14 January, 2010, 9:46
> Hi Iain
> Yes, there seem to have been problems connecting to the
> www.biomart.org registry, which have now been resolved by the
> BioMart team at the OICR. As an alternative, you can
> also access the Ensembl databases from the Ensembl BioMart
> installation at www.ensembl.org using the following biomaRt
> commands:
> 
> > library(biomaRt)
> > listMarts(host = "www.ensembl.org")
>                biomart                        version
> 1 ENSEMBL_MART_ENSEMBL                     Ensembl 56
> 2     ENSEMBL_MART_SNP           Ensembl variation 56
> 3 ENSEMBL_MART_FUNCGEN Ensembl functional genomics 56
> 4    ENSEMBL_MART_VEGA                        Vega 36
> 5             REACTOME                       Reactome
> 6     wormbase_current             WormBase (CSHL US)
> 7                pride                 PRIDE (EBI UK)
> > mart = useMart("ENSEMBL_MART_ENSEMBL", host = "www.ensembl.org")
> etc...
> 
> Regards
> Rhoda
> 
> 
> On 14 Jan 2010, at 09:37, Iain Gallagher wrote:
> 
> > Seems to be ok this morning. Ran the same query with no problems.
> > Perhaps some transient changes at the server end.
> > 
> > Cheers
> > 
> > Iain
> > 
> > --- On Wed, 13/1/10, Paul Leo <p.leo at uq.edu.au> wrote:
> > 
> >> From: Paul Leo <p.leo at uq.edu.au>
> >> Subject: Re: [BioC] biomaRt error
> >> To: "Iain Gallagher" <iaingallagher at btopenworld.com>
> >> Cc: bioconductor at stat.math.ethz.ch
> >> Date: Wednesday, 13 January, 2010, 23:45
> >> Ditto.
> >> Actually, 3 days ago some of my old queries started returning
> >> different numbers of rows:
> >> 
> >> 
> >>
> >> > length(unique(ann.all[ann.all[,"gene_biotype"]=="protein_coding", "ensembl_gene_id"]))
> >> [1] 19940
> >>
> >> > length(unique(ann.all.entrez[ann.all.entrez[,"gene_biotype"]=="protein_coding", "ensembl_gene_id"])) # 22836
> >> [1] 22836
> >> 
> >> versus what I got for months beforehand, up to 4 days ago:
> >> 
> >>
> >> > length(unique(ann.all[ann.all[,"gene_biotype"]=="protein_coding", "ensembl_gene_id"]))
> >> [1] 22836
> >>
> >> > length(unique(ann.all.entrez[ann.all.entrez[,"gene_biotype"]=="protein_coding", "ensembl_gene_id"])) # 22836
> >> [1] 22836
> >> 
> >> for the following queries:
> >> a.filter <- c("chromosome_name", "start", "end")
> >> fil.vals <- list(chrom, low.cut, high.cut)
> >>
> >> ann.all <- getBM(attributes = c("ensembl_gene_id", "external_gene_id",
> >>     "chromosome_name", "start_position", "end_position", "strand",
> >>     "gene_biotype", "mgi_symbol", "description"),
> >>     filters = a.filter, values = fil.vals, mart = mart)
> >>
> >> ann.all.entrez <- getBM(attributes = c("ensembl_gene_id", "external_gene_id",
> >>     "ensembl_transcript_id", "chromosome_name", "start_position",
> >>     "end_position", "strand", "gene_biotype", "entrezgene", "mgi_symbol",
> >>     "description"),
> >>     filters = a.filter, values = fil.vals, mart = mart)
> >> 
> >> No idea why the first query was affected and not the second.
> >> This did coincide with a BioC update I did two days ago, but
> >> that may be a coincidence.
> >> 
> >> 
> >> 
> >> 
> >> Today I get the same error:
> >> > listMarts()
> >> Error: non-BioMart die():
> >> not well-formed (invalid token) at line 1, column 11797, byte 11797
> >> at /usr/lib/perl5/XML/Parser.pm line 187
> >>
> >>   does not seem to be XML, nor to identify a file name
> >> 
> >> 
> >> 
> >> > sessionInfo()
> >> R version 2.10.1 (2009-12-14)
> >> x86_64-pc-linux-gnu
> >>
> >> locale:
> >>  [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
> >>  [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
> >>  [5] LC_MONETARY=C              LC_MESSAGES=en_AU.UTF-8
> >>  [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>
> >> other attached packages:
> >> [1] org.Mm.eg.db_2.3.6  RSQLite_0.8-1       DBI_0.2-5
> >> [4] AnnotationDbi_1.8.1 Biobase_2.6.1       biomaRt_2.2.0
> >>
> >> loaded via a namespace (and not attached):
> >> [1] RCurl_1.3-1  tcltk_2.10.1 tools_2.10.1 XML_2.6-0
> >> 
> >> 
> >> 
> >> -----Original Message-----
> >> From: Iain Gallagher <iaingallagher at btopenworld.com>
> >> To: bioconductor at stat.math.ethz.ch
> >> Subject: [BioC] biomaRt error
> >> Date: Wed, 13 Jan 2010 19:00:14 +0000 (GMT)
> >> 
> >> Hello List
> >> 
> >> Can anyone shed any light on the following biomaRt error?
> >> Has something changed at Ensembl?
> >> 
> >> > library(biomaRt)
> >> >
> >> > mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
> >> Error: non-BioMart die():
> >> not well-formed (invalid token) at line 1, column 11797, byte 11797
> >> at /usr/lib/perl5/XML/Parser.pm line 187
> >>
> >>   does not seem to be XML, nor to identify a file name
> >> > sessionInfo()
> >> R version 2.10.1 (2009-12-14)
> >> x86_64-pc-linux-gnu
> >>
> >> locale:
> >>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
> >>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
> >>  [5] LC_MONETARY=C              LC_MESSAGES=en_GB.UTF-8
> >>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
> >>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> >> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >>
> >> other attached packages:
> >> [1] biomaRt_2.2.0
> >>
> >> loaded via a namespace (and not attached):
> >> [1] RCurl_1.3-1 XML_2.6-0
> >>> 
> >> 
> >> Thanks
> >> 
> >> Iain
> >> 
> >> 
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> >> 
> >> 
> > 
> 
> Rhoda Kinsella Ph.D.
> Ensembl Bioinformatician,
> European Bioinformatics Institute (EMBL-EBI),
> Wellcome Trust Genome Campus,
> Hinxton
> Cambridge CB10 1SD,
> UK.
> 
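
The workaround Rhoda describes above (pointing biomaRt at a different host when the default registry fails) could be wrapped in a small retry helper. What follows is only a sketch under stated assumptions: `try_hosts` and its `connect` argument are hypothetical names that do not come from biomaRt, and the helper uses nothing but base R so that it can be run without a network connection.

```r
# Try each host in turn and return the first successful connection.
# `try_hosts` and `connect` are hypothetical names for illustration;
# `connect` is any function that takes one host string and either
# returns a connection object or signals an error.
try_hosts <- function(hosts, connect) {
  for (host in hosts) {
    result <- tryCatch(
      connect(host),
      error = function(e) {
        # Report the failure and fall through to the next host.
        message("Host ", host, " failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(result)) {
      return(result)
    }
  }
  stop("None of the hosts could be reached: ", paste(hosts, collapse = ", "))
}
```

With biomaRt loaded, `connect` might be something like `function(h) useMart("ENSEMBL_MART_ENSEMBL", host = h)`, the call quoted above for www.ensembl.org. Note that mart names can differ between hosts: the original query in this thread used "ensembl" against the default registry, so a real fallback may need a different mart name per host.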
>
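
For the row-count discrepancy Paul reports above (19940 versus 22836 protein-coding genes), one way to investigate would be to list which gene IDs one result contains that the other lacks. The sketch below assumes data frames shaped like the getBM() results in the thread, with `ensembl_gene_id` and `gene_biotype` columns. The helper names and the toy data are made up for illustration and are not real Ensembl output.

```r
# Unique gene IDs of one biotype in a getBM()-style data frame.
# Helper names are hypothetical; column names follow the thread.
biotype_ids <- function(ann, biotype = "protein_coding") {
  keep <- !is.na(ann$gene_biotype) & ann$gene_biotype == biotype
  unique(ann$ensembl_gene_id[keep])
}

# Same count as the length(unique(...)) expressions in the thread.
count_genes <- function(ann, biotype = "protein_coding") {
  length(biotype_ids(ann, biotype))
}

# IDs present in `reference` but absent from `query`.
missing_genes <- function(reference, query, biotype = "protein_coding") {
  setdiff(biotype_ids(reference, biotype), biotype_ids(query, biotype))
}

# Toy data standing in for the two query results (invented values).
ann.all <- data.frame(
  ensembl_gene_id = c("G1", "G2", "G3"),
  gene_biotype = c("protein_coding", "protein_coding", "miRNA"),
  stringsAsFactors = FALSE
)
ann.all.entrez <- data.frame(
  ensembl_gene_id = c("G1", "G1", "G2", "G4", "G3"),
  gene_biotype = c("protein_coding", "protein_coding", "protein_coding",
                   "protein_coding", "miRNA"),
  stringsAsFactors = FALSE
)

count_genes(ann.all)                    # 2
count_genes(ann.all.entrez)             # 3
missing_genes(ann.all.entrez, ann.all)  # "G4"
```

Inspecting the missing IDs (for example their chromosome or description) may show whether the shortfall is concentrated in one region or attribute, which could help tell a server-side change apart from a client-side one.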


