[BioC] biomaRt manual

Thu Mar 29 13:35:25 CEST 2007

On Thursday 29 March 2007 07:28, James W. MacDonald wrote:
> Hi Weiwei,
>
> Weiwei Shi wrote:
> > Sorry :) when I was composing the following email, I did not realize
> > there were already a couple of replies. I read the manual carefully but
> > I still have some questions, like this one:
> >
> > For example,
> >
> >>getBM(attributes=c("affy_hg_u95a", "entrezgene"), filters="affy_hg_u95a",
> >> values=head(ids2), mart=human)
> >
> >   affy_hg_u95a entrezgene
> > 1     31308_at         NA
> > 2     31310_at       2741
> > 3     31312_at       9312
> >
> >>head(ids2)
> >
> > [1] "31307_at"   "31308_at"   "31309_r_at" "31310_at"   "31311_at"
> > [6] "31312_at"
> >
> >>getBM(attributes=c("affy_hg_u95a", "entrezgene"), filters="affy_hg_u95a",
> >> values="31307_at", mart=human)
> >
> > NULL
> >
> > I am confused by "NULL" and "NA". What is the difference between
> > them?
>
> Steffen Durinck will know better, but I believe NULL means that Ensembl
> doesn't think that probeset maps to anything (e.g., there is nothing
> available), and NA means that there is no Entrez Gene ID for that probeset.
>
> For instance, if you pull the Entrez Gene ID for 31307_at from the
> hgu95aENTREZID environment, it lists 9594, but if you search Entrez Gene
> for that ID it says it has been discontinued.
>
> > Another question is how to make more than 8000 queries faster. I have
> > read some of the previous posts on this.

Make sure that you really need to make 8000 queries. The values argument of
getBM takes a vector of IDs, so it is much faster to make one or a few large
queries than to make many small ones.
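
As a rough sketch of what "a few large queries" could look like: the helper
below, query_in_chunks, is hypothetical and not part of biomaRt. It splits the
IDs into chunks, runs one query per chunk, and stacks the results. The chunk
size of 500 is an arbitrary assumption. The commented usage relies on the
`human` mart and `ids2` vector from the quoted message, and on the attribute
names used there.

```r
# Hypothetical helper (not part of biomaRt): run `query_fun` once per chunk
# of `ids` and stack the per-chunk results into one data frame.
query_in_chunks <- function(ids, query_fun, chunk_size = 500) {
  ids <- unique(ids)
  # Assign each ID a chunk number: 1, 1, ..., 2, 2, ...
  chunk_index <- ceiling(seq_along(ids) / chunk_size)
  chunks <- unname(split(ids, chunk_index))
  results <- lapply(chunks, query_fun)
  # rbind() skips NULL entries, so chunks that return nothing are dropped;
  # if every chunk returns NULL, the result is NULL.
  do.call(rbind, results)
}

# Intended use with biomaRt (needs network access; `human` and `ids2` are
# the objects from the quoted message):
# res <- query_in_chunks(ids2, function(x)
#   getBM(attributes = c("affy_hg_u95a", "entrezgene"),
#         filters = "affy_hg_u95a", values = x, mart = human))
# no_row <- setdiff(ids2, res$affy_hg_u95a)  # IDs for which no row came back
```

If all the IDs fit in one request, a single getBM call with values=ids2 is
simpler still. The setdiff line lists the probe sets that returned no row at
all, as opposed to those that returned a row with NA.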

Sean