[BioC] BiomaRt Ensembl RefSeq query error
georg.otto at imm.ox.ac.uk
Tue Jan 21 13:09:52 CET 2014
I am trying to query 14005 Ensembl gene IDs for their RefSeq annotations
using this code (I can send the gene IDs upon request):
library(biomaRt)
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
getBM(attributes = c("ensembl_gene_id",
mart = ensembl, uniqueRows = TRUE)
If I query the full gene set, many RefSeq IDs come back missing (NA), for
example for the gene ENSMUSG00000000567 (Sox9). If I query a subset
instead, say ensembl.ids[1:12000], all the RefSeq IDs are there. It does
not seem to matter which subset I use, only that the subset is smaller
than ca. 12000 genes.
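Since the size of the query appears to be the trigger, splitting the IDs into smaller batches may be one way to narrow the problem down or work around it. The following is only a sketch under stated assumptions: query_in_chunks is a hypothetical helper (not a biomaRt function), the chunk size of 5000 is arbitrary (anything well below the ca. 12000 observed above), and the "refseq_mrna" attribute in the commented usage is an assumed example, not necessarily the attribute used in the original query.

```r
# Hypothetical helper (not part of biomaRt): apply a query function to
# fixed-size chunks of IDs and stack the per-chunk data frames.
query_in_chunks <- function(ids, query_fn, chunk_size = 5000) {
  stopifnot(chunk_size >= 1)
  if (length(ids) == 0) return(NULL)

  # Assign each ID to a chunk number: 1, 1, ..., 2, 2, ...
  chunk_index <- ceiling(seq_along(ids) / chunk_size)
  chunks <- split(ids, chunk_index)

  # Run one query per chunk, then combine the results row-wise.
  results <- lapply(chunks, query_fn)
  combined <- do.call(rbind, results)
  rownames(combined) <- NULL
  combined
}

# Possible usage with biomaRt (requires network access; the attribute
# "refseq_mrna" is an assumption made for illustration):
#
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
# annot <- query_in_chunks(ensembl.ids, function(chunk) {
#   getBM(attributes = c("ensembl_gene_id", "refseq_mrna"),
#         filters = "ensembl_gene_id", values = chunk,
#         mart = ensembl, uniqueRows = TRUE)
# })
```

Passing the query as a function keeps the batching logic separate from the biomaRt call, so the same helper can be checked against a small subset before running it on all 14005 IDs.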
Any idea what is going on?