[BioC] Question about BiomartGeneRegionTrack
Tiphaine Martin
tiphaine.martin at kcl.ac.uk
Fri Feb 21 17:19:42 CET 2014
Hi,
Thanks. In the meantime, I did as you suggested: rtracklayer -> GRanges
object -> AnnotationTrack() constructor.
I have a question about BiomartGeneRegionTrack. I would like to visualize
the annotation at the gene level only, not at the transcript level. But for
each gene, I would like to see each exon in the right color, not just one
long line, and ideally the 5' UTR and 3' UTR parts of the exons at the
extremities would be drawn thinner.
I tried the different values of the option "stacking" (squish, dense, hide),
but none of them does what I would like. I also tried the option
collapseTranscripts = TRUE, but it creates only one long bar.
What I would like is a combination of collapseTranscripts = TRUE
(because it keeps the different genes separate) and
stacking = "dense" (because I can see the name of my gene and the
different exons).
I hope that my explanation is understandable enough.
Do you know which options I need to use?
Below are the command lines that I ran in R.
Did I make a mistake with the options? If so, could you help me?
Should I use biomaRt + AnnotationTrack with other options to do that (I
tried, but I did not succeed)?
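One possible way to get a gene-level view that still shows separate exons is to merge the overlapping exons of all transcripts of a gene before building the track. The following is only a minimal base-R sketch of that idea, not a Gviz feature: the data frame `exons` (columns gene, start, end) and the helper `merge_exons_by_gene` are hypothetical names introduced for illustration, and the coordinates are made up.

```r
# Merge overlapping exon intervals within each gene (base R only).
# 'exons' is a hypothetical data frame with columns gene, start, end,
# e.g. derived from a getBM() result; the values below are invented.
merge_exons_by_gene <- function(exons) {
  pieces <- lapply(split(exons, exons$gene), function(g) {
    g <- g[order(g$start, g$end), ]
    # A new block starts whenever an exon begins after the
    # furthest end coordinate seen so far in this gene.
    furthest <- cummax(g$end)
    new_block <- c(TRUE, g$start[-1] > furthest[-nrow(g)])
    block <- cumsum(new_block)
    data.frame(gene  = g$gene[1],
               start = as.numeric(tapply(g$start, block, min)),
               end   = as.numeric(tapply(g$end, block, max)),
               stringsAsFactors = FALSE)
  })
  merged <- do.call(rbind, pieces)
  rownames(merged) <- NULL
  merged
}

exons <- data.frame(gene  = c("A", "A", "A", "B"),
                    start = c(100, 150, 400, 1000),
                    end   = c(200, 300, 500, 1100),
                    stringsAsFactors = FALSE)
merged <- merge_exons_by_gene(exons)
print(merged)
```

The merged start and end columns could then be passed to AnnotationTrack() with group set to the gene column, so that each gene is drawn as one group of distinct exon boxes. This sketch does not address drawing UTRs thinner.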
> library(Gviz)
> library(biomaRt)
> gen <- "hg19"
> chr <- "2"
> start <- 43625705
> end <- 43826133
> biomTrack <- BiomartGeneRegionTrack(genome = gen, chromosome = chr,
+     start = start, end = end, name = "ENSEMBL", stacking = "squish")
> biomTrack2 <- BiomartGeneRegionTrack(genome = gen, chromosome = chr,
+     start = start, end = end, name = "ENSEMBL", stacking = "dense")
> biomTrack1 <- BiomartGeneRegionTrack(genome = gen, chromosome = chr,
+     start = start, end = end, name = "ENSEMBL", stacking = "hide")
> martfunc <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
> # chrEnsembl was not defined in the commands as posted; it is assumed
> # here to be the chromosome name without the "chr" prefix
> chrEnsembl <- sub("^chr", "", chr)
> ensfunc <- getBM(c("ensembl_gene_id", "ensembl_transcript_id",
+     "exon_chrom_start", "exon_chrom_end", "strand", "gene_biotype",
+     "external_gene_id"),
+     filters = c("chromosome_name", "start", "end"),
+     values = list(chrEnsembl, start, end), mart = martfunc)
> # columns: 1 = gene id, 3/4 = exon start/end, 5 = strand,
> # 6 = biotype, 7 = gene symbol
> data_trackfunc <- AnnotationTrack(chromosome = chr, strand = ensfunc[, 5],
+     start = ensfunc[, 3], end = ensfunc[, 4],
+     feature = ensfunc[, 6], group = ensfunc[, 1], id = ensfunc[, 7],
+     name = "genes ENSEMBL")
> listtracks <- list(biomTrack, biomTrack1, biomTrack2, data_trackfunc)
> plotTracks(listtracks, from = start, to = end, showId = TRUE)
=> creates the plot "GViz_NOcollapseOption.png"
> plotTracks(listtracks, from = start, to = end, showId = TRUE,
+     collapseTranscripts = TRUE, shape = "arrow",
+     transcriptAnnotation = "symbol")
=> creates the plot "Gviz_collapseOption.png"
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8
attached base packages:
[1] parallel grid stats graphics grDevices utils datasets
methods base
other attached packages:
[1] TxDb.Hsapiens.UCSC.hg19.knownGene_2.10.1 GenomicFeatures_1.14.2
[3] AnnotationDbi_1.24.0 Biobase_2.22.0
[5] ggbio_1.10.11 ggplot2_0.9.3.1
[7] BiocInstaller_1.12.0 rtracklayer_1.22.3
[9] GenomicRanges_1.14.4 XVector_0.2.0
[11] IRanges_1.20.6 BiocGenerics_0.8.0
[13] Gviz_1.6.0 biomaRt_2.18.0
loaded via a namespace (and not attached):
[1] Biostrings_2.30.1 biovizBase_1.10.7 bitops_1.0-6
[4] BSgenome_1.30.0 cluster_1.14.4 colorspace_1.2-4
[7] DBI_0.2-7 dichromat_2.0-0 digest_0.6.4
[10] Formula_1.1-1 gridExtra_0.9.1 gtable_0.1.2
[13] Hmisc_3.14-0 labeling_0.2 lattice_0.20-24
[16] latticeExtra_0.6-26 MASS_7.3-29 munsell_0.4.2
[19] plyr_1.8 proto_0.3-10 RColorBrewer_1.0-5
[22] RCurl_1.95-4.1 reshape2_1.2.2 Rsamtools_1.14.3
[25] RSQLite_0.11.4 scales_0.2.3 splines_3.0.2
[28] stats4_3.0.2 stringr_0.6.2 survival_2.37-7
[31] tools_3.0.2 VariantAnnotation_1.8.12 XML_3.95-0.2
[34] zlibbioc_1.8.0
Regards,
Tiphaine
On 21/02/14 13:29, Hahne, Florian wrote:
> Ah, I see! Thanks for the clarification. That makes perfect sense.
> @Martin, I will provide a fix for the Gviz package in the next couple
> of days.
> Florian
>
> From: Michael Lawrence <lawrence.michael at gene.com>
> Date: Friday, February 21, 2014 2:23 PM
> To: Florian Hahne <florian.hahne at novartis.com>
> Cc: Tiphaine Martin <tiphaine.martin at kcl.ac.uk>,
> "bioconductor at r-project.org" <bioconductor at r-project.org>,
> Michael Lawrence <lawrence.michael at gene.com>
> Subject: Re: Gviz problem to extract some track with UcscTrack like
> Broad ChromHMM
>
> Hi Florian,
>
> trackNames,UCSCSession will only return the tracks in the actual
> browser. To get the track names from the table browser, just use
> trackNames,UCSCTableQuery. In other words, UCSCTableQuery is the
> interface to the table browser, while UCSCSession is the interface to
> the actual browser.
>
> Michael
>
>
> On Fri, Feb 21, 2014 at 1:40 AM, Hahne, Florian
> <florian.hahne at novartis.com> wrote:
>
> Hi Martin,
> This seems to be a problem in the rtracklayer package:
>
> > library(rtracklayer)
> > session <- browserSession()
> > genome(session) <- "hg19"
>
> > grep("Broad", trackNames(session))
> integer(0)
>
> > grep("Broad", names(trackNames(session)))
> integer(0)
>
> But:
> > query <- ucscTableQuery(session, "Broad ChromHMM")
> > tableNames(query)
> [1] "wgEncodeBroadHmmGm12878HMM" "wgEncodeBroadHmmH1hescHMM"
> [3] "wgEncodeBroadHmmK562HMM" "wgEncodeBroadHmmHepg2HMM"
> [5] "wgEncodeBroadHmmHuvecHMM" "wgEncodeBroadHmmHmecHMM"
> [7] "wgEncodeBroadHmmHsmmHMM" "wgEncodeBroadHmmNhekHMM"
> [9] "wgEncodeBroadHmmNhlfHMM"
>
> Even though the Broad ChromHMM track exists, it is not listed
> by trackNames(). The output of trackNames is used by Gviz to check
> whether the requested track exists in order to give a more useful
> error message. As a quick fix for now you could simply download
> the track data via rtracklayer into a GRanges object and use that
> as the input for your AnnotationTrack() constructor.
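The existence check described above can be reproduced in base R, which may help to see why the error message lists every known track. This is only an illustrative sketch: `avail_tracks` is a made-up stand-in for the names returned by trackNames(session), and `check_track` is a hypothetical helper, not part of Gviz or rtracklayer.

```r
# Made-up subset of the track names a browser session might report.
avail_tracks <- c("Assembly", "CCDS", "Chromosome Band")

# Return the matched track name, or NA when match.arg() rejects it.
check_track <- function(track, available) {
  tryCatch(match.arg(track, sort(available)),
           error = function(e) NA_character_)
}

known   <- check_track("CCDS", avail_tracks)
missing <- check_track("Broad ChromHMM", avail_tracks)
print(known)
print(missing)
```

A name that is absent from the list fails the match.arg() call even if the corresponding table can be queried, which is consistent with the error reported for "Broad ChromHMM".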
>
> Michael, is that a bug in rtracklayer, or is it intentional that
> trackNames does not list all available tracks? Broad ChromHMM is
> also listed in the table browser:
> http://genome.ucsc.edu/cgi-bin/hgTables?hgsid=363821417&clade=mammal&org=Human&db=hg19&hgta_group=allTracks&hgta_track=wgEncodeBroadHmm&hgta_table=0&hgta_regionType=range&position=chr21%3A33%2C031%2C597-33%2C041%2C570&hgta_outputType=bed&hgta_outFileName=
>
> > sessionInfo()
> R version 3.0.2 Patched (2013-10-27 r64116)
> Platform: i386-apple-darwin12.5.0/i386 (32-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] tools parallel grid stats graphics grDevices utils
> [8] datasets methods base
>
> other attached packages:
> [1] rtracklayer_1.22.0 GenomicRanges_1.14.4 XVector_0.2.0
> [4] IRanges_1.20.6 Biobase_2.22.0 BiocGenerics_0.8.0
> [7] Gviz_1.6.0 BiocInstaller_1.12.0
>
> loaded via a namespace (and not attached):
> [1] AnnotationDbi_1.24.0 biomaRt_2.18.0 Biostrings_2.30.1
> [4] biovizBase_1.10.7 bitops_1.0-6 BSgenome_1.30.0
> [7] cluster_1.14.4 colorspace_1.2-4 DBI_0.2-7
> [10] dichromat_2.0-0 Formula_1.1-1 GenomicFeatures_1.14.2
> [13] Hmisc_3.13-0 labeling_0.2 lattice_0.20-24
> [16] latticeExtra_0.6-26 munsell_0.4.2 plyr_1.8
> [19] RColorBrewer_1.0-5 RCurl_1.95-4.1 Rsamtools_1.14.2
> [22] RSQLite_0.11.4 scales_0.2.3 splines_3.0.2
> [25] stats4_3.0.2 stringr_0.6.2 survival_2.37-4
> [28] XML_3.98-1.1 zlibbioc_1.8.0
>
> Florian
>
>
>
>
>
> Hi,
>
> My genome is hg19 and I am trying to get the list of tables related to
> the track "Broad ChromHMM". I used the command line from rtracklayer
> to create the vector of candidate tables.
>
> I have included my sessionInfo(). Sorry for forgetting to give it to you.
>
> Also, I don't know if I am alone in this situation, but when I read
> your documentation in a web browser (Safari or Firefox) or in
> Preview on Mac OS X 10.6, I don't see the pictures.
>
> Regards,
> Tiphaine
>
> On 17/02/14 08:31, Hahne, Florian wrote:
>> Hi Martin,
>> I'd really like to help, but from your code below I can't tell for
>> which genome you are trying to do this (the value of 'gen'). I
>> also have no idea how you came up with the 'track.name' vector.
>> In general, Gviz is using rtracklayer::tableNames to figure out
>> which tables and tracks are available. So you should get the same
>> results, assuming that you checked for the same chromosome.
>> Also please always include the output of sessionInfo() when
>> asking for help in order for us to know which R and package
>> version we are dealing with.
>> Florian
>>
>> From: "Martin, Tiphaine" <tiphaine.martin at kcl.ac.uk>
>> Date: Thursday, February 13, 2014 11:18 PM
>> To: Florian Hahne <florian.hahne at novartis.com>
>> Subject: Gviz problem to extract some track with UcscTrack like
>> Broad ChromHMM
>>
>> Dear Florian,
>>
>> I am trying to use your package Gviz to visualise my data together
>> with some data from UCSC, for example "Broad ChromHMM", but I get an
>> error message.
>> I don't understand why, because I have checked the names of the
>> tracks and tables with the rtracklayer package, on which your package
>> relies. Could you help me?
>>
>> Regards
>>
>> Tiph
>> R Command used to find the name of track and table
>> >track.names["Broad ChromHMM"]
>> Broad ChromHMM
>> “wgEncodeBroadHmm"
>>
>> >sapply(track, function(track) {
>> + tableNames(ucscTableQuery(mySession, track=track))
>> + })
>> Broad ChromHMM
>> [1,] "wgEncodeBroadHmmGm12878HMM"
>> [2,] "wgEncodeBroadHmmH1hescHMM"
>> [3,] "wgEncodeBroadHmmK562HMM"
>> [4,] "wgEncodeBroadHmmHepg2HMM"
>> [5,] "wgEncodeBroadHmmHuvecHMM"
>> [6,] "wgEncodeBroadHmmHmecHMM"
>> [7,] "wgEncodeBroadHmmHsmmHMM"
>> [8,] "wgEncodeBroadHmmNhekHMM"
>> [9,] "wgEncodeBroadHmmNhlfHMM"
>> > UcscTrack(genome = gen, chrom
>>
>> My errors when I try to use UcscTrack:
>> > UcscTrack(genome = gen, chromosome = chr, track = "Broad ChromHMM",
>> +   table = "wgEncodeBroadHmmGm12878HMM",
>> +   from = start, to = end, trackType = "AnnotationTrack",
>> +   rstarts = "chromStart", rends = "chromEnd", gene = "name",
>> +   symbol = "name", strand = "strand",
>> +   fill = "itemRgb", name = "UCSC Genes")
>>
>> Error in match.arg(track, sort(c(availTracks,
>> names(availTracks)))) :
>> 'arg' should be one of “1000G Ph1 Accsbl”, “1000G Ph1 Vars”,
>> “46-Way Cons”, “5% Lowest S”, “acembly”, “AceView Genes”, “Affy
>> Exon Array”, “Affy GNF1H”, “Affy RNA Loc”, “Affy U133”, “Affy
>> U133Plus2”, “Affy U95”, “affyExonArray”, “affyGnf1h”, “affyU133”,
>> “affyU133Plus2”, “affyU95”, “All SNPs(132)”, “All SNPs(135)”,
>> “All SNPs(137)”, “All SNPs(138)”, “Allen Brain”, “allenBrainAli”,
>> “allHg19RS_BW”, “altSeqComposite10”, “Assembly”, “BAC End Pairs”,
>> “bacEndPairs”, “Base Position”, “BU ORChID”, “Burge RNA-seq”,
>> “burgeRnaSeqGemMapperAlign”, “CCDS”, “ccdsGene”, “CD34 DnaseI”,
>> “CGAP SAGE”, “cgapSage”, “chainSelf”, “Chromosome Band”,
>> “clinvar”, “ClinVar Variants”, “Common SNPs(132)”, “Common
>> SNPs(135)”, “Common SNPs(137)”, “Common SNPs(138)”, “Cons Indels
>> MmCf”, “cons100way”, “cons46way”, “Conserv
>>
>>
>> > UcscTrack(genome = gen, chromosome = chr, track = "wgEncodeBroadHmm",
>> +   table = "wgEncodeBroadHmmGm12878HMM",
>> +   from = start, to = end, trackType = "AnnotationTrack",
>> +   rstarts = "chromStart", rends = "chromEnd", gene = "name",
>> +   symbol = "name", strand = "strand",
>> +   fill = "itemRgb", name = "UCSC Genes")
>>
>> Error in match.arg(track, sort(c(availTracks,
>> names(availTracks)))) :
>> 'arg' should be one of “1000G Ph1 Accsbl”, “1000G Ph1 Vars”,
>> “46-Way Cons”, “5% Lowest S”, “acembly”, “AceView Genes”, “Affy
>> Exon Array”, “Affy GNF1H”, “Affy RNA Loc”, “Affy U133”, “Affy
>> U133Plus2”, “Affy U95”, “affyExonArray”, “affyGnf1h”, “affyU133”,
>> “affyU133Plus2”, “affyU95”, “All SNPs(132)”, “All SNPs(135)”,
>> “All SNPs(137)”, “All SNPs(138)”, “Allen Brain”, “allenBrainAli”,
>> “allHg19RS_BW”, “altSeqComposite10”, “Assembly”, “BAC End Pairs”,
>> “bacEndPairs”, “Base Position”, “BU ORChID”, “Burge RNA-seq”,
>> “burgeRnaSeqGemMapperAlign”, “CCDS”, “ccdsGene”, “CD34 DnaseI”,
>> “CGAP SAGE”, “cgapSage”, “chainSelf”, “Chromosome Band”,
>> “clinvar”, “ClinVar Variants”, “Common SNPs(132)”, “Common
>> SNPs(135)”, “Common SNPs(137)”, “Common SNPs(138)”, “Cons Indels
>> MmCf”, “cons100way”, “cons46way”, “Conserv
>
>
> --
> ----------------------------
> Tiphaine Martin
> PhD Research Student | King's College
> The Department of Twin Research & Genetic Epidemiology
> Genetics & Molecular Medicine Division
> St Thomas' Hospital
> 4th Floor, Block D,
> South Wing
> SE1 7EH London
> United Kingdom
>
> email: tiphaine.martin at kcl.ac.uk
> Fax: +44 (0) 207 188 6761
>
>
--
----------------------------
Tiphaine Martin
PhD Research Student | King's College
The Department of Twin Research & Genetic Epidemiology
Genetics & Molecular Medicine Division
St Thomas' Hospital
4th Floor, Block D,
South Wing
SE1 7EH London
United Kingdom
email :tiphaine.martin at kcl.ac.uk
Fax: +44 (0) 207 188 6761
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Gviz_collapseOption.png
Type: image/png
Size: 17165 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140221/7601ebea/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: GViz_NOcollapseOption.png
Type: image/png
Size: 40243 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140221/7601ebea/attachment-0001.png>