[BioC] problem of column names when using biomaRt

Steffen sdurinck at lbl.gov
Mon Oct 29 17:16:28 CET 2007


Hi Li,

The ordering error was caused by a problem on the BioMart server side and
has now been fixed by the BioMart developers.
Thanks again for reporting this error.
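
If you want to check for this kind of misalignment yourself, a small sanity
check along the following lines may help. It is only a sketch in base R and
does not use biomaRt: the three-column data frame copies a few values from
the output quoted below, and the names expected_prefix and prefix_ok, as well
as the assumed ID prefixes (ENST for transcripts, ENSP for translations), are
illustrative choices and not part of the package.

```r
# A few columns from the quoted output (rows 1 and 5), with the column
# names exactly as they were returned.
temp <- data.frame(
  transcript_count      = c("ENST00000383302", "ENST00000376122"),
  transcript_stable_id  = c("ENSP00000372790", "ENSP00000365290"),
  translation_stable_id = c("1", "1"),
  stringsAsFactors = FALSE
)

# Assumed Ensembl ID prefix for each *_stable_id column (an assumption
# based on the identifiers visible in the quoted output).
expected_prefix <- c(
  transcript_stable_id  = "ENST",
  translation_stable_id = "ENSP"
)

# TRUE where every value in the column starts with the expected prefix.
prefix_ok <- vapply(
  names(expected_prefix),
  function(col) all(startsWith(temp[[col]], expected_prefix[[col]])),
  logical(1)
)

print(prefix_ok)
```

For the values above both checks come out FALSE, because each identifier
sits one column to the left of the name it belongs to.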

Cheers,
Steffen


Li Lingdu wrote:
> Hello R gurus,
>
> I used the R "biomaRt" package to extract some annotation information
> about the TNF gene.
>
> The code I used is below:
>
> -------------
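> library(biomaRt)
> # connect to the Ensembl human gene dataset
> en <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")
> # names of all attributes in the "Structures" category
> attributes <- listAttributes(mart=en, category="Structures")$name
> # query those attributes for Entrez Gene ID 7124 (TNF)
> temp <- getBM(attributes=attributes, filters="entrezgene", values=7124, mart=en)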
> ----------------
>
> But the column names of the returned data frame are mixed up. The head
> of the temp object is:
> ---
>  struct_biotype cdna_length cds_length str_chrom_name constitutive
> exon_cds_end exon_cds_start exon_chrom_end exon_chrom_start
> exon_coding_end exon_coding_start rank exon_stable_id external_db_name
> struct_external_gene_id gene_chrom_end gene_chrom_start gene_stable_id
> peptide_length transcript_chrom_end transcript_chrom_start
> transcript_chrom_strand transcript_count transcript_stable_id
> translation_stable_id
> 1 protein_coding 1669 702 c6_COX 1 186 1 31678370 31678016 31678370
> 31678185 1 ENSE00001495979 Uniprot/SWISSPROT TNFA_HUMAN 31680778
> 31678016 ENSG00000206328 233 31680778 31678016 1 ENST00000383302
> ENSP00000372790 1
> 2 protein_coding 1669 702 c6_COX 1 232 187 31679022 31678977 31679022
> 31678977 2 ENSE00001495978 Uniprot/SWISSPROT TNFA_HUMAN 31680778
> 31678016 ENSG00000206328 233 31680778 31678016 1 ENST00000383302
> ENSP00000372790 1
> 3 protein_coding 1669 702 c6_COX 1 280 233 31679257 31679210 31679257
> 31679210 3 ENSE00001495976 Uniprot/SWISSPROT TNFA_HUMAN 31680778
> 31678016 ENSG00000206328 233 31680778 31678016 1 ENST00000383302
> ENSP00000372790 1
> 4 protein_coding 1669 702 c6_COX 1 702 281 31680778 31679559 31679980
> 31679559 4 ENSE00001495975 Uniprot/SWISSPROT TNFA_HUMAN 31680778
> 31678016 ENSG00000206328 233 31680778 31678016 1 ENST00000383302
> ENSP00000372790 1
> 5 protein_coding 1685 702 6 1 186 1 31651683 31651314 31651683
> 31651498 1 ENSE00001469490 HUGO TNF 31654092 31651314 ENSG00000204490
> 233 31654092 31651314 1 ENST00000376122 ENSP00000365290 1
> 6 protein_coding 1685 702 6 1 232 187 31652335 31652290 31652335
> 31652290 2 ENSE00001469486 HUGO TNF 31654092 31651314 ENSG00000204490
> 233 31654092 31651314 1 ENST00000376122 ENSP00000365290 1
> ---
>
> Obviously, the last column is wrong: translation_stable_id shows 1, while
> the ENSP identifier appears under transcript_stable_id.
>
> I think the problem happens when the column names are assigned to the
> retrieved data.frame: the column order of the retrieved data does not
> exactly match the order of the attributes.
>
> What happened?
>
> Any suggestions or help would be greatly appreciated!
>
>
> LiGang
>
>
> -----
> My platform:
>
> platform       i386-pc-mingw32
> arch           i386
> os             mingw32
> system         i386, mingw32
> status
> major          2
> minor          6.0
> year           2007
> month          10
> day            03
> svn rev        43063
> language       R
> version.string R version 2.6.0 (2007-10-03)
>
> LC_COLLATE=Chinese_People's Republic of China.936;
> LC_CTYPE=Chinese_People's Republic of China.936;
> LC_MONETARY=Chinese_People's Republic of China.936;
> LC_NUMERIC=C;
> LC_TIME=Chinese_People's Republic of China.936
> ---------------


