[BioC] genome_intervals and GRange objects/HiTC package
Nicolas Servant
nservant at curie.fr
Thu Feb 7 16:08:56 CET 2013
Dear all,
Yes sorry for that, i didn't check the BioC list since a while.
Indead I made a lot of change in the HiTC package in devel branch.
Among the major changes, the package is now based on Genomic Ranges
instead of genomeIntervals.
The main reason of this change is an increase of the inter-operablility
of this package with the rest of the BioConductor project.
Since the 13.2 version, I now also distinguish the HTCexp objects,
basically a single interaction map, from the HTClist (a list of HTCexp)
object, basically useful for Hi-C data.
I often have some direct feedbacks from users, and according to their
needs, the package evolves quite rapidly. So please try to use the devel
version (v1.3.2, but the 1.3.3 should arrived soon) and do not hesitate
to contact if you have any question.
Nicolas
Le 07/02/2013 15:26, Valerie Obenchain a écrit :
> Herman,
>
> You must be using the release version of HiTC. Providing the output of
> sessionInfo() is also helpful when posting a question so others can
> see what versions you are using. I've put mine done so at the bottom
> of this post.
>
> The Genome_intervals class is present in the release version (1.2.0)
> but not in devel (1.3.2). The author must have decided to use GRanges
> objects instead (I've cc'd them on this email). In the mean time, here
> is an example of how to convert a 'Genome_intervals' to a 'GRanges'.
>
> Using the release version, there is an example of how to create object
> 'i' at the bottom of this man page,
>
> ?'Genome_intervals-class'
>
> > i
> Object of class Genome_intervals
> 2 base intervals and 2 inter-base intervals(*):
> chr01 [1, 2)
> chr01 [3, 5)
> chr02 [4, 6] *
> chr02 [8, 9) *
>
> annotation:
> seq_name inter_base
> 1 chr01 FALSE
> 2 chr01 FALSE
> 3 chr02 TRUE
> 4 chr02 TRUE
>
> You can create a GRanges from 'i' with the following code,
>
> > GRanges(annotation(i)$seq_name, IRanges(i at .Data[,1], i at .Data[,2]))
> GRanges with 4 ranges and 0 elementMetadata cols:
> seqnames ranges strand
> <Rle> <IRanges> <Rle>
> [1] chr01 [1, 2] *
> [2] chr01 [3, 5] *
> [3] chr02 [4, 6] *
> [4] chr02 [8, 9] *
> ---
> seqlengths:
> chr01 chr02
> NA NA
>
> > gr <- GRanges(annotation(i)$seq_name, IRanges(i at .Data[,1],
> i at .Data[,2]))
>
> You can add metadata with values() or elementMetadata(). These two
> functions have been replaced with mcols() in the devel branch.
> > values(gr) <- DataFrame("inter_base"=annotation(i)$inter_base)
> > gr
> GRanges with 4 ranges and 1 elementMetadata col:
> seqnames ranges strand | inter_base
> <Rle> <IRanges> <Rle> | <logical>
> [1] chr01 [1, 2] * | FALSE
> [2] chr01 [3, 5] * | FALSE
> [3] chr02 [4, 6] * | TRUE
> [4] chr02 [8, 9] * | TRUE
> ---
> seqlengths:
> chr01 chr02
> NA NA
>
>
> Valerie
>
> > sessionInfo()
> ...
>
> other attached packages:
> [1] HiTC_1.2.0 girafe_1.8.0 genomeIntervals_1.12.0
> [4] intervals_0.13.3 ShortRead_1.14.4 latticeExtra_0.6-24
> [7] RColorBrewer_1.0-5 lattice_0.20-10 Rsamtools_1.8.6
> [10] Biostrings_2.24.1 GenomicRanges_1.8.13 IRanges_1.14.4
> [13] BiocGenerics_0.2.0
>
> loaded via a namespace (and not attached):
> [1] BSgenome_1.24.0 Biobase_2.16.0 bitops_1.0-5 hwriter_1.3
> [5] stats4_2.15.1 tools_2.15.1 zlibbioc_1.2.0
>
>
> On 02/06/13 09:13, Valerie Obenchain wrote:
>> Hi Herman,
>>
>> When you post a question please provide code that reproduces the
>> problem. Your example does not show how 'hic.gr' was created or what
>> kind of object it is.
>>
>> library(HiTC)
>> data(Nora_5C)
>> intervals <- y_intervals(E14sub)
>>
>> The 'intervals' object is a GRanges.
>>
>> > class(intervals)
>> [1] "GRanges"
>> attr(,"package")
>> [1] "GenomicRanges"
>>
>> You can extract metadata colums with 'mcols'.
>>
>> head(mcols(intervals))
>>
>> > head(mcols(intervals))
>> DataFrame with 6 rows and 4 columns
>> name score itemRgb thick
>> <character> <numeric> <character> <IRanges>
>> 1 FOR_3 0 #0000FF [98834146, 98837506]
>> 2 FOR_5 0 #0000FF [98840772, 98841227]
>> 3 FOR_7 0 #0000FF [98843249, 98848364]
>> 4 FOR_9 0 #0000FF [98849664, 98850577]
>> 5 FOR_13 0 #0000FF [98862022, 98867230]
>> 6 FOR_15 0 #0000FF [98869143, 98870524]
>>
>> Also use 'mcols' to set the value of a metadata column.
>>
>> intervals$newData <- seq_along(intervals)
>> > head(mcols(intervals))
>> DataFrame with 6 rows and 5 columns
>> name score itemRgb thick newData
>> <character> <numeric> <character> <IRanges> <integer>
>> 1 FOR_3 0 #0000FF [98834146, 98837506] 1
>> 2 FOR_5 0 #0000FF [98840772, 98841227] 2
>> 3 FOR_7 0 #0000FF [98843249, 98848364] 3
>> 4 FOR_9 0 #0000FF [98849664, 98850577] 4
>> 5 FOR_13 0 #0000FF [98862022, 98867230] 5
>> 6 FOR_15 0 #0000FF [98869143, 98870524] 6
>>
>>
>> It looks like y_intervals() is returning a valid GRanges. Can you
>> provide an example of how you are getting a 'Genome_interval' object?
>>
>> Valerie
>>
>>
>>
>> On 02/05/2013 01:05 PM, Hermann Norpois wrote:
>>> Hello,
>>>
>>> in the documentation of HiTC package y_intervals () is described as a
>>> method to "return the ygi GRanges object defining the y intervals".
>>> I tried
>>> this for the test data (see dput) and expected a "classical" GRange
>>> object.
>>> For instance I would like to do an operation like
>>>
>>> mcols (hic.gr)$test<- 1
>>>
>>> But it did not work as hic.gr is a Genome_interval object (as dput
>>> mentioned). Can I transform this in a classical GR-object allowing
>>> mcols-operations? Can anybody comment on the difference between
>>> Genome_interval an Grange?
>>>
>>>
>>> Thanks
>>> Hermann
>>>
>>>
>>>
>>>> hic.gr
>>> Object of class Genome_intervals
>>> 5 base intervals and 0 inter-base intervals(*):
>>> chr14 [1, 999999]
>>> chr14 [1e+06, 1999999]
>>> chr14 [2e+06, 2999999]
>>> chr14 [3e+06, 3999999]
>>> chr14 [4e+06, 4999999]
>>>
>>> annotation:
>>> seq_name id inter_base
>>> 1 chr14 HIC_bin1 FALSE
>>> 2 chr14 HIC_bin2 FALSE
>>> 3 chr14 HIC_bin3 FALSE
>>> 4 chr14 HIC_bin4 FALSE
>>> 5 chr14 HIC_bin5 FALSE
>>>
>>>> dput (hic.gr)
>>> new("Genome_intervals"
>>> , .Data = structure(c(1, 1e+06, 2e+06, 3e+06, 4e+06, 999999,
>>> 1999999,
>>> 2999999,
>>> 3999999, 4999999), .Dim = c(5L, 2L))
>>> , annotation = structure(list(seq_name = structure(c(1L, 1L,
>>> 1L, 1L,
>>> 1L), .Label = "chr14", class = "factor"),
>>> id = structure(c(1L, 20L, 31L, 42L, 53L), .Label = c("HIC_bin1",
>>> "HIC_bin10", "HIC_bin100", "HIC_bin101", "HIC_bin102",
>>> "HIC_bin103",
>>> "HIC_bin104", "HIC_bin105", "HIC_bin106", "HIC_bin107",
>>> "HIC_bin11",
>>> "HIC_bin12", "HIC_bin13", "HIC_bin14", "HIC_bin15", "HIC_bin16",
>>> "HIC_bin17", "HIC_bin18", "HIC_bin19", "HIC_bin2", "HIC_bin20",
>>> "HIC_bin21", "HIC_bin22", "HIC_bin23", "HIC_bin24", "HIC_bin25",
>>> "HIC_bin26", "HIC_bin27", "HIC_bin28", "HIC_bin29", "HIC_bin3",
>>> "HIC_bin30", "HIC_bin31", "HIC_bin32", "HIC_bin33", "HIC_bin34",
>>> "HIC_bin35", "HIC_bin36", "HIC_bin37", "HIC_bin38", "HIC_bin39",
>>> "HIC_bin4", "HIC_bin40", "HIC_bin41", "HIC_bin42", "HIC_bin43",
>>> "HIC_bin44", "HIC_bin45", "HIC_bin46", "HIC_bin47", "HIC_bin48",
>>> "HIC_bin49", "HIC_bin5", "HIC_bin50", "HIC_bin51", "HIC_bin52",
>>> "HIC_bin53", "HIC_bin54", "HIC_bin55", "HIC_bin56", "HIC_bin57",
>>> "HIC_bin58", "HIC_bin59", "HIC_bin6", "HIC_bin60", "HIC_bin61",
>>> "HIC_bin62", "HIC_bin63", "HIC_bin64", "HIC_bin65", "HIC_bin66",
>>> "HIC_bin67", "HIC_bin68", "HIC_bin69", "HIC_bin7", "HIC_bin70",
>>> "HIC_bin71", "HIC_bin72", "HIC_bin73", "HIC_bin74", "HIC_bin75",
>>> "HIC_bin76", "HIC_bin77", "HIC_bin78", "HIC_bin79", "HIC_bin8",
>>> "HIC_bin80", "HIC_bin81", "HIC_bin82", "HIC_bin83", "HIC_bin84",
>>> "HIC_bin85", "HIC_bin86", "HIC_bin87", "HIC_bin88", "HIC_bin89",
>>> "HIC_bin9", "HIC_bin90", "HIC_bin91", "HIC_bin92", "HIC_bin93",
>>> "HIC_bin94", "HIC_bin95", "HIC_bin96", "HIC_bin97", "HIC_bin98",
>>> "HIC_bin99"), class = "factor"), inter_base = c(FALSE, FALSE,
>>> FALSE, FALSE, FALSE)), .Names = c("seq_name", "id", "inter_base"
>>> ), row.names = c(NA, 5L), class = "data.frame")
>>> , closed = structure(c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
>>> TRUE, TRUE,
>>> TRUE,
>>> TRUE), .Dim = c(5L, 2L))
>>> , type = "Z"
>>> )
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
--
Nicolas Servant
Plateforme de Bioinformatique
Unité 900 : Institut Curie - Inserm - Mines ParisTech
26, rue d'Ulm - 75248 Paris Cedex 05 - FRANCE
Email: Nicolas.Servant at curie.fr
Tel: 01 56 24 69 85
http://bioinfo.curie.fr/
More information about the Bioconductor
mailing list