[BioC] Understanding the randomness of Biomart

steffen at stat.Berkeley.EDU steffen at stat.Berkeley.EDU
Thu Aug 21 19:37:01 CEST 2008


Hi Nathan,

This is how these BioMart systems work.  If you add the
with_hgnc_symbol filter and set it to TRUE:

fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"), filters =
c("chromosome_name", "start", "end", "with_hgnc_symbol"), values =
list(as.numeric("9"),19198907, 19357826, TRUE), mart = ensembl)


You'll only retrieve the affy ids once and each one of them will have an
HGNC symbol.
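
If you would rather post-filter a result you have already fetched, the
following is a minimal sketch in base R.  It rebuilds the first result
quoted below by hand so that it runs without a BioMart connection.  It
assumes that the blank cells come back as empty strings (NA is handled
as well); the names is_blank and annotated are only illustrative and are
not part of biomaRt.

```r
# Rebuild the first quoted result as a plain data frame (no network needed).
fetched <- data.frame(
  affy_hg_u133_plus_2 = c("226867_at", "205684_s_at", "226867_at",
                          "205684_s_at", "234968_at", "234968_at"),
  hgnc_symbol = c("", "", "DENND4C", "DENND4C", "", "DENND4C"),
  stringsAsFactors = FALSE
)

# Assumption: a missing value is either an empty string or NA.
is_blank <- function(x) is.na(x) | x == ""

# Keep only rows that have both a probeset id and an HGNC symbol,
# then drop any exact duplicate rows.
keep <- !is_blank(fetched$affy_hg_u133_plus_2) &
  !is_blank(fetched$hgnc_symbol)
annotated <- unique(fetched[keep, ])
rownames(annotated) <- NULL

print(annotated)
```

The same filter can be applied to any data frame returned by getBM()
that has these two columns.  Filtering on the server with
with_hgnc_symbol, as above, avoids fetching the unannotated rows in the
first place.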

Cheers,
Steffen

> Hi everyone,
>
> I have been playing with the biomaRt package a bit more and I am
> trying to work out what is going on here:
>
> ensembl = useMart("ensembl_mart_47", dataset =
> "hsapiens_gene_ensembl", archive = TRUE)
>
> fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"), filters =
> c("chromosome_name", "start", "end"), values = list(as.numeric("9"),
> 19198907, 19357826), mart = ensembl)
>   affy_hg_u133_plus_2 hgnc_symbol
> 1           226867_at
> 2         205684_s_at
> 3           226867_at     DENND4C
> 4         205684_s_at     DENND4C
> 5           234968_at
> 6           234968_at     DENND4C
>
> fetched = getBM(c("affy_hg_u133_plus_2", "hgnc_symbol"), filters =
> c("chromosome_name", "start", "end"), values = list(as.numeric("9"),
> 33925736, 34088257), mart = ensembl)
>
>   affy_hg_u133_plus_2 hgnc_symbol
> 1
> 2           224789_at
> 3           224789_at      WDR40A
>
> I cannot understand why I am getting 2 rows for some probesets, one
> containing a HUGO identifier and the other not. Is there any
> relevance to this result (probeset 234968_at), and why do some rows
> show no probeset at all? Is there a specific reason for this, or is
> this just something that needs to be post-filtered?
>
> Many thanks in advance.
>
> Nathan


