[BioC] Biomart query in Web interface Vs. biomaRt package?
Steffen
sdurinck at lbl.gov
Fri Oct 5 18:36:14 CEST 2007
Hi Jeremie,
Many thanks for reporting this. Yes, the BioMart web interface and
biomaRt should return identical results if the same query is used.
The query you sent via the web interface was (you can see this by
clicking the XML button):
<Query virtualSchemaName = "default" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
<Filter name = "biol_process" value = "GO:0006996"/>
<Attribute name = "ensembl_gene_id" />
<Attribute name = "ensembl_transcript_id" />
<Attribute name = "hgnc_symbol" />
<Attribute name = "external_gene_id" />
</Dataset>
</Query>
The query sent via biomaRt was (you can see this by setting
verbose = TRUE):
> getBM(attributes = "external_gene_id", filters = "go", values = "GO:0006996", mart = human, verbose = TRUE)
<?xml version='1.0' encoding='UTF-8'?><!DOCTYPE Query><Query
virtualSchemaName = 'default' uniqueRows = '1' count = '0'
datasetConfigVersion = '0.6' requestid= "biomaRt"> <Dataset name =
'hsapiens_gene_ensembl'><Attribute name = 'external_gene_id'/><Filter
name = 'go' value = 'GO:0006996' /></Dataset></Query>
These queries use different filter names ("biol_process" in the web
interface, "go" in biomaRt) and indeed give different results, but I'm
not sure whether this is intended. We should contact the Ensembl
helpdesk to report the inconsistency so we can figure out what's
going on.
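To make the differences between the two queries explicit before writing to the helpdesk, here is a minimal sketch in base R that pulls the filter names, attribute names and uniqueRows setting out of the two XML strings quoted above. The helper names (element_names, unique_rows, summarise_query) are made up for illustration and are not part of biomaRt, and the regular expressions only assume the simple name = "..." layout seen in these two queries.

```r
# Web-interface query, copied from the XML button output above.
web_query <- paste0(
  '<Query virtualSchemaName = "default" header = "0" uniqueRows = "0" ',
  'count = "" datasetConfigVersion = "0.6" >',
  '<Dataset name = "hsapiens_gene_ensembl" interface = "default" >',
  '<Filter name = "biol_process" value = "GO:0006996"/>',
  '<Attribute name = "ensembl_gene_id" />',
  '<Attribute name = "ensembl_transcript_id" />',
  '<Attribute name = "hgnc_symbol" />',
  '<Attribute name = "external_gene_id" />',
  '</Dataset></Query>'
)

# biomaRt query, copied from the verbose = TRUE output above.
biomart_query <- paste0(
  "<Query virtualSchemaName = 'default' uniqueRows = '1' count = '0' ",
  "datasetConfigVersion = '0.6' requestid= \"biomaRt\"> ",
  "<Dataset name = 'hsapiens_gene_ensembl'>",
  "<Attribute name = 'external_gene_id'/>",
  "<Filter name = 'go' value = 'GO:0006996' /></Dataset></Query>"
)

# Hypothetical helper: the name attribute of every <tag name = ...> element.
element_names <- function(xml, tag) {
  pattern <- paste0("<", tag, "\\s+name\\s*=\\s*[\"']([^\"']+)[\"']")
  hits <- regmatches(xml, gregexpr(pattern, xml, perl = TRUE))[[1]]
  sub(pattern, "\\1", hits, perl = TRUE)
}

# Hypothetical helper: the value of the uniqueRows attribute on <Query>.
unique_rows <- function(xml) {
  sub(".*uniqueRows\\s*=\\s*[\"']([^\"']*)[\"'].*", "\\1", xml, perl = TRUE)
}

# Collect the parts of a query that determine which rows come back.
summarise_query <- function(xml) {
  list(filters    = element_names(xml, "Filter"),
       attributes = element_names(xml, "Attribute"),
       uniqueRows = unique_rows(xml))
}

web <- summarise_query(web_query)
bm  <- summarise_query(biomart_query)
str(web)
str(bm)
```

On these two strings the summary shows biol_process versus go as the filter, four attributes versus one, and uniqueRows 0 versus 1, so the filter name is not the only thing that differs between the two requests.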
Cheers,
Steffen
J.J.P.Lebrec at lumc.nl wrote:
> Hi,
>
> Using the web based Biomart tool (
> http://www.ensembl.org/biomart/martview/ ) in database=Ensembl 46,
> dataset=Homo sapiens Genes (NCBI 36), I have manually extracted all
> unique genes' 'External Gene ID' using GO pathway GO:0006996 as a
> filter. I obtained 1141 unique genes.
>
> I tried to automate the process using the biomaRt package with the
> query below, which yielded only 9 unique genes!
>
>
> > human = useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> Checking attributes and filters ... ok
>
> > getBM(attributes = "external_gene_id", filters = "go", values = "GO:0006996", mart = human)
> external_gene_id
> 1 KIF3A
> 2 HPS3
> 3 HPS3
> 4 DTNBP1
> 5 DTNBP1
> 6 KIF5C
> 7 KIF4A
> 8 HPS1
> 9 HPS6
> 10 HPS6
> 11 HPS6
> 12 KIF25
> 13 HPS4
>
> > sessionInfo()
> R version 2.5.1 (2007-06-27)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>
> attached base packages:
> [1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
> [7] "base"
>
> other attached packages:
> biomaRt RCurl XML
> "1.10.1" "0.8-0" "1.9-0"
>
>
> I thought the two queries to be equivalent, could you please tell me
> what I am doing wrong here?
>
> Many thanks in advance,
>
> Jeremie