[BioC] reverse complement or no reverse complement on biomaRt / biomart.org

James W. MacDonald jmacdon at med.umich.edu
Tue Oct 13 16:21:46 CEST 2009


Hi Tefina,

Tefina Paloma wrote:
> Dear Jim,
> 
> Do you know if these sequences are sense or antisense?
> If you export the sequence via biomart (via the webpage), you get the following:
> 
> >ENST00000280193 utr5:KNOWN_protein_coding
> CGGGGAAGGGGAGGGAGGAGGGGGACGAGGGCTCTGGCGGGTTTGGAGGGGCTGAACATC
> GCGGGGTGTTCTGGTGTCCCCCGCCCCGCCTCTCCAAAAAGCTACACCGACGCGGACCGC
> GGCGGCGTCCTCCCTCGCCCTCGCTTCACCTCGCGGGCTCCGAATGCGGGGAGCTCGGAT
> GTCCGGTTTCCTGTGAGGCTTTTACCTGACACCCGCCGCCTTTCCCCGGCACTGGCTGGG
> AGGGCGCCCTGCAAAGTTGGGAACGCGGAGCCCCGGACCCGCTCCCGCCGCCTCCGGCTC
> GCCCAGGGGGGGTCGCCGGGAGGAGCCCGGGGGAGAGGGACCAGGAGGGGCCCGCGGCCT
> CGCAGGGGCGCCCGCGCCCCCACCCCTGCCCCCGCCAGCGGACCGGTCCCCCACCCCCGG
> TCCTTCCACC
> 
> >5' Flanking sequence chromosome:GRCh37:4:177713896:177713945:1
> AAGTGAGAGGAGCCGGGCCGCGGGCGCTGCGGCGGGGGCGCTGGCGGCGG

How are you getting this? I get the same thing as the web service, as I 
noted yesterday:

 > getSequence(id = c("ENST00000280193"), type = "ensembl_transcript_id",
 +             seqType = "transcript_flank", upstream = 50, mart = mart)
                                     transcript_flank ensembl_transcript_id
1 CCGCCGCCAGCGCCCCCGCCGCAGCGCCCGCGGCCCGGCTCCTCTCACTT       ENST00000280193
 > sessionInfo()
R version 2.10.0 Under development (unstable) (2009-09-21 r49780)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
  [1] biomaRt_2.1.0        annaffy_1.17.2       affycoretools_1.17.4
  [4] KEGG.db_2.3.0        GO.db_2.3.0          RSQLite_0.7-2
  [7] DBI_0.2-4            AnnotationDbi_1.7.17 affy_1.23.9
[10] Biobase_2.5.6

loaded via a namespace (and not attached):
  [1] affyio_1.13.5        annotate_1.23.2      Biostrings_2.13.50
  [4] Category_2.11.4      gcrma_2.17.2         genefilter_1.25.7
  [7] GOstats_2.11.3       graph_1.23.6         GSEABase_1.7.3
[10] IRanges_1.3.89       limma_2.19.4         preprocessCore_1.7.9
[13] RBGL_1.21.12         RCurl_0.98-1         splines_2.10.0
[16] survival_2.35-7      tools_2.10.0         XML_2.5-1
[19] xtable_1.5-5
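
For what it is worth, the 50-mer you pasted from the web export and the 50-mer that getSequence() returns for me look like reverse complements of each other, so one of them is presumably reported on the opposite strand. That can be checked in base R without any extra packages. This is only a minimal sketch: revcomp() is a hypothetical helper written for this message, not a biomaRt function, and it assumes plain A/C/G/T input with no ambiguity codes.

```r
# Hypothetical helper (not part of biomaRt): reverse complement of a DNA string.
# Assumes only A, C, G, T (upper or lower case); other characters pass through unchanged.
revcomp <- function(x) {
  comp <- chartr("ACGTacgt", "TGCAtgca", x)             # complement each base
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")    # then reverse the order
}

# 5' flanking sequence as pasted from the web export in this thread
web   <- "AAGTGAGAGGAGCCGGGCCGCGGGCGCTGCGGCGGGGGCGCTGGCGGCGG"
# transcript_flank as returned by getSequence() in this thread
flank <- "CCGCCGCCAGCGCCCCCGCCGCAGCGCCCGCGGCCCGGCTCCTCTCACTT"

identical(revcomp(web), flank)   # TRUE for the two sequences quoted here
```

If Biostrings is installed, reverseComplement(DNAString(web)) should give the same answer and also handles ambiguity codes.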

Best,

Jim




> 
> So, in contrast to the web view, the flanking sequence is reverse complemented.
> Basically it is just a question of correct definition and assignment:
> which sequences are sense and which are antisense?
> 
> Best,
> Tefina
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826


