[Bioc-sig-seq] unexpected genes names list using getBM{biomaRt}

James W. MacDonald jmacdon at med.umich.edu
Mon Dec 7 16:20:48 CET 2009


Hi Ramzi,

Ramzi TEMANNI wrote:
> Hi,
> I want to extract the gene names, given the chromosome and position of
> each gene:
>> t.cpd[1:10,1:2]
>       CHR.M1 POS.M1
>  [1,] "12"   "140059033"
>  [2,] "19"   "164634640"
>  [3,] "10"   "32347784"
>  [4,] "11"   "30576841"
>  [5,] "2"    "86479831"
>  [6,] "12"   "237019866"
>  [7,] "4"    "76487174"
>  [8,] "20"   "136121868"
>  [9,] "2"    "6255547"
> [10,] "1"    "67658137"
> 
> I use the following commands:
> library(biomaRt)
> mart = useMart("ensembl")
> ensembl = useDataset("hsapiens_gene_ensembl", mart = mart)
> gn.m1<-getBM(attributes= c("hgnc_symbol"),
>        filters=c("chromosome_name","start"),
>        values=list(t.cpd[1:10,1],t.cpd[1:10,2]), mart=ensembl)
> 
> I expected a list of 10 gene names, but instead I get 8652 genes:
> hgnc_symbol
> 1      OR2M1P
> 2      OR2L1P
> 3   HSD17B7P1
> 4     OR14L1P
> 5       OR2W5
> 6       VN1R5
> ......
> 8649        WFS1
> 8650    SNORD73A
> 8651     SNORA24
> 8652     SNORA26
> 
> Did I miss something ?

Yes. You are giving the start position, but not the end. Without 
explicitly telling the Biomart server where to stop looking for genes, 
where do you think it will stop by default?
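
For illustration, here is a minimal sketch of a query that sets both bounds. It assumes a window of 1000 bp on either side of each position, which is an arbitrary choice, and that the dataset exposes the "start" and "end" filters alongside "chromosome_name" (listFilters(ensembl) will show what is available). The helper names position_window and genes_near are hypothetical, not part of biomaRt. Querying one row at a time keeps each position tied to its own chromosome.

```r
# Hypothetical helper: turn one position into a start/end window of +/- flank bp.
position_window <- function(pos, flank = 1000L) {
  pos <- as.integer(pos)      # positions in t.cpd are stored as character strings
  flank <- as.integer(flank)  # integers avoid "1e+05"-style text in the query
  c(start = max(1L, pos - flank), end = pos + flank)
}

# Hypothetical helper: query one chromosome/position pair with both bounds set.
genes_near <- function(chr, pos, mart, flank = 1000L) {
  w <- position_window(pos, flank)
  biomaRt::getBM(attributes = c("chromosome_name", "start_position",
                                "end_position", "hgnc_symbol"),
                 filters = c("chromosome_name", "start", "end"),
                 values = list(chr, w[["start"]], w[["end"]]),
                 mart = mart)
}

# Usage (needs network access and the biomaRt package):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# hits <- lapply(1:10, function(i) genes_near(t.cpd[i, 1], t.cpd[i, 2], mart))
```

How wide the window should be depends on what the positions represent, so treat the 1000 bp default as a placeholder.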

Also, several of your coordinates are nonsensical. For instance, chr12 
is only 133851895 bases long, chr20 is 63025520 bases long, etc.
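
A quick way to spot such rows before querying is to compare each position with the length of its chromosome. The sketch below does this for the ten rows shown in the question. The chromosome lengths are GRCh37 values filled in as an assumption, so substitute the lengths of whichever assembly you are querying.

```r
# Chromosome lengths in bp (assumed GRCh37 values, only the chromosomes needed here).
chr_len <- c("1" = 249250621, "2" = 243199373, "4" = 191154276,
             "10" = 135534747, "11" = 135006516, "12" = 133851895,
             "19" = 59128983, "20" = 63025520)

# First ten rows of t.cpd, copied from the question.
chr <- c("12", "19", "10", "11", "2", "12", "4", "20", "2", "1")
pos <- as.numeric(c("140059033", "164634640", "32347784", "30576841",
                    "86479831", "237019866", "76487174", "136121868",
                    "6255547", "67658137"))

# TRUE where the position lies beyond the end of its chromosome.
out_of_range <- unname(pos > chr_len[chr])
print(data.frame(chr, pos, out_of_range))
```

With these lengths, rows 1, 2, 6 and 8 are flagged, which suggests the chromosome and position columns may not belong together.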

Best,

Jim


> 
> Thanks in advance for your help
> 
> Best Regards,
> Ramzi

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


