[BioC] Fwd: biomaRt column order
steffen at stat.Berkeley.EDU
Fri Jul 11 06:13:15 CEST 2008
Hi Mark,
The main problem here is that the query retrieves attributes from different
attribute pages. The web service does not support this, although such
queries are possible and useful, especially for what we do in Bioconductor.
To get an idea of what attribute pages are, you could check out the BioMart
web interfaces, e.g. at http://www.ensembl.org
They group attributes of a similar type together so they can be displayed
on one web page. This makes less sense for command-line use, as in biomaRt.
The column names are returned by the web service, so this problem will have
to be solved there. However, if you take the attributes for chromosome_name
and ensembl_gene_id from the sequence attribute page as well, the query
should return the column names correctly.
To list all attributes that belong to one page with biomaRt, you could do:
listAttributes(mart, category="Sequences")
If you change your query as follows, the column names should be in the
correct order:
b <- getBM(c("sequence_gene_stable_id", "sequence_str_chrom_name",
             "sequence_biotype", "sequence_exon_chrom_start",
             "sequence_exon_chrom_end"),
           filters="ensembl_gene_id", values="ENSG00000197530", mart=mart)
You'll get:
   gene_stable_id str_chrom_name struct_biotype exon_chrom_start exon_chrom_end
1 ENSG00000197530              1 protein_coding          1540747        1540876
2 ENSG00000197530              1 protein_coding          1541751        1541857
3 ENSG00000197530              1 protein_coding          1548632        1548942
4 ENSG00000197530              1 protein_coding          1549017        1549188
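If the original attribute names have to be kept, the reshuffling mentioned in
the forwarded message can be sketched offline as below. This is only an
illustration: the toy data frame copies the first two rows of the mislabelled
output quoted further down, and the vector `actual` (the order in which the
data seems to have arrived) is an assumption read off that output, not
something biomaRt reports.

```r
# Toy reconstruction of the mislabelled result (first two rows only;
# values copied from the output in the forwarded message).
b <- data.frame(
  ensembl_gene_id  = c("protein_coding", "protein_coding"),
  chromosome_name  = c(1542803, 1548674),
  struct_biotype   = c(1542958, 1548942),
  exon_chrom_start = c("ENSG00000197530", "ENSG00000197530"),
  exon_chrom_end   = c("1", "1"),
  stringsAsFactors = FALSE
)

# Assumption: the order in which the columns actually hold their data,
# judged by eye from the printed values.
actual <- c("struct_biotype", "exon_chrom_start", "exon_chrom_end",
            "ensembl_gene_id", "chromosome_name")
wanted <- names(b)   # the order the names were returned in

names(b) <- actual   # relabel each column to match its contents
b <- b[, wanted]     # then put the columns back in the requested order
```

Because the mapping is worked out by hand, it would need to be checked again
whenever the set of requested attributes changes.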
Cheers,
Steffen
>
>
> Begin forwarded message:
>
>> From: Mark Robinson <mrobinson at wehi.EDU.AU>
>> Date: 5 July 2008 9:13:48 AM
>> To: bioconductor at stat.math.ethz.ch
>> Subject: [BioC] biomaRt column order
>>
>> Dear list.
>>
>> I'm using biomaRt to do a fairly simple query against the Ensembl
>> human database. I get returned a table with column names that don't
>> match the data in the columns. See below.
>>
>> I can reshuffle them afterwards to make them match, but that's not ideal.
>>
>> Am I doing something wrong?
>>
>> Thanks,
>> Mark
>>
>>
>>
>>
>> > library(biomaRt)
>> Loading required package: RCurl
>> > mart=useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
>> Checking attributes and filters ... ok
>> > mart
>> Object of class 'Mart':
>> Using the ensembl BioMart database
>> Using the hsapiens_gene_ensembl dataset
>> > b <- getBM(c("ensembl_gene_id", "chromosome_name", "sequence_biotype",
>> +              "sequence_exon_chrom_start", "sequence_exon_chrom_end"),
>> +            filters="ensembl_gene_id", values="ENSG00000197530", mart=mart)
>> > dim(b)
>> [1] 25 5
>> > b[1:10,]
>>    ensembl_gene_id chromosome_name struct_biotype exon_chrom_start
>> 1   protein_coding         1542803        1542958  ENSG00000197530
>> 2   protein_coding         1548674        1548942  ENSG00000197530
>> 3   protein_coding         1549017        1549188  ENSG00000197530
>> 4   protein_coding         1550038        1550144  ENSG00000197530
>> 5   protein_coding         1550234        1550428  ENSG00000197530
>> 6   protein_coding         1550529        1550671  ENSG00000197530
>> 7   protein_coding         1551893        1551997  ENSG00000197530
>> 8   protein_coding         1552080        1552242  ENSG00000197530
>> 9   protein_coding         1552317        1552450  ENSG00000197530
>> 10  protein_coding         1552539        1552687  ENSG00000197530
>>    exon_chrom_end
>> 1               1
>> 2               1
>> 3               1
>> 4               1
>> 5               1
>> 6               1
>> 7               1
>> 8               1
>> 9               1
>> 10              1
>> > sessionInfo()
>> R version 2.7.1 (2008-06-23)
>> i386-apple-darwin8.10.1
>>
>> locale:
>> en_AU.UTF-8/en_AU.UTF-8/C/C/en_AU.UTF-8/en_AU.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] biomaRt_1.14.0 RCurl_0.9-3
>>
>> loaded via a namespace (and not attached):
>> [1] XML_1.95-2
>>
>>
>>
>> ------------------------------
>> Mark Robinson
>> Epigenetics Laboratory, Garvan
>> Bioinformatics Division, WEHI
>> e: m.robinson at garvan.org.au
>> e: mrobinson at wehi.edu.au
>> p: +61 (0)3 9345 2628
>> f: +61 (0)3 9347 0852
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor