[BioC] biomaRt problem
Wolfgang Huber
whuber at embl.de
Tue Aug 3 23:21:40 CEST 2010
Hi Anupam Sinha,
first, as always, following the posting guide (providing the output of
sessionInfo(), and a reproducible example that does not depend on a
private file that exists only on your computer) would be useful.
Second, your getBM query is incomplete: you specify an argument for
'values', but not for 'filters'. In effect, your values are ignored
and no filtering is performed; the query is made on all ~51,000 genes
in the dataset.
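To illustrate the point about 'filters' and 'values', here is a minimal sketch of a filtered query. It assumes that you want to select genes by HGNC symbol via the "hgnc_symbol" filter; the symbols below and the helper name query_by_symbol are hypothetical and only for illustration. The query is wrapped in a function so that nothing contacts the Ensembl server until you call it.

```r
# Hypothetical example symbols; in practice these would come from your own file.
symbols = c("TP53", "BRCA1", "EGFR")

# Hypothetical helper: 'filters' names the field to select on,
# and 'values' supplies the entries to match against that field.
query_by_symbol = function(symbols, mart) {
  biomaRt::getBM(attributes = c("ensembl_gene_id", "hgnc_symbol"),
                 filters    = "hgnc_symbol",
                 values     = symbols,
                 mart       = mart)
}

# Usage (needs the biomaRt package and network access):
# mart = biomaRt::useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
# res  = query_by_symbol(symbols, mart)
# head(res)
```

With 'filters' given, only the genes matching the supplied symbols should be returned, rather than the whole dataset.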
Third, why the attribute "hsapiens_dn" is NA for all genes in the
dataset is a question I need to pass to someone more familiar with this
particular dataset - I will forward it to the Ensembl helpdesk.
Here's a code example:
#----------------
library("biomaRt")
mart = useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
filters = listFilters(mart)
attrs = listAttributes(mart)
print("hsapiens_dn" %in% filters$name)
print("hsapiens_dn" %in% attrs$name)
res = getBM(attributes = c("ensembl_gene_id", "hsapiens_dn"),
            mart = mart)
print(table(is.na(res$hsapiens_dn)))
print(sessionInfo())
#----------------
and its output
#----------------
[1] FALSE
[1] TRUE
TRUE
51726
R version 2.12.0 Under development (unstable) (2010-08-02 r52661)
Platform: x86_64-apple-darwin10.4.0 (64-bit)
locale:
[1] C/C/C/C/C/it_IT
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] biomaRt_2.5.1 fortunes_1.3-7
loaded via a namespace (and not attached):
[1] RCurl_1.4-3 XML_3.1-0
#----------------
Best wishes
Wolfgang
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber
On Jul/27/10 3:51 PM, anupam sinha wrote:
> Hi all,
> I am trying to download dN/dS values for human genes from
> Ensembl using biomaRt. I have used code from the website:
>
> http://www.r-bloggers.com/biomart-and-biomart/
>
>> library("biomaRt")
>> mart<- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
>> genes<- read.csv("file.txt")  # this file contains HGNC gene symbols for Homo sapiens
>> results<- getBM(attributes = c("ensembl_gene_id","hsapiens_dn"), values = genes$hsapiens_dn, mart = mart)
>
> But all the values of hsapiens_dn are shown to be "NA". The first few lines
> of the output
>
>> head(results,10)
> ensembl_gene_id hsapiens_dn
> 1 ENSG00000215781 NA
> 2 ENSG00000243259 NA
> 3 ENSG00000225566 NA
> 4 ENSG00000189096 NA
> 5 ENSG00000215750 NA
> 6 ENSG00000212884 NA
> 7 ENSG00000212886 NA
> 8 ENSG00000229617 NA
> 9 ENSG00000241176 NA
> 10 ENSG00000215705 NA
>
> Can anyone please tell me where I am going wrong? Thanks in advance for
> any suggestions.
>