> GenomicRangesList() should be used here instead of GRangesList, for
> efficiency, generality and perhaps semantics.

What makes GenomicRangesList() more general or efficient?  I did not
realize that I should be doing this.

Thanks,

--t


*He that would live in peace and at ease,*
*Must not speak all he knows, nor judge all he sees.*

Benjamin Franklin, Poor Richard's Almanack <http://archive.org/details/poorrichardsalma00franrich>


On Thu, Oct 24, 2013 at 3:36 PM, Michael Lawrence <lawrence.michael@gene.com> wrote:

>
>
>
> On Thu, Oct 24, 2013 at 2:54 PM, Tim Triche, Jr. <tim.triche@gmail.com> wrote:
>
>> ps.  Why +/- 100kb?  That's an awful lot of padding given that tons of
>> the genome falls into h3k4me1 peaks
>>
>>
>>
>> On Thu, Oct 24, 2013 at 2:52 PM, Tim Triche, Jr. <tim.triche@gmail.com> wrote:
>>
>>> If I'm guessing right, something like this... ?
>>>
>>> grset <- readRDS("grset.rds")
>>> show(grset)
>>> ##
>>> ## class: GenomicRatioSet
>>> ## dim: 468211 32
>>> ## exptData(0):
>>> ## assays(2): M CN
>>> ## ...
>>> ##
>>> highVar <- names(which(rowData(grset)$varByGroupQval < 0.05))
>>> ##
>>> ## about 50 probes, here
>>> ##
>>> ## could also use FDb.InfiniumMethylation.hg19 if not already mapped
>>>
>>> ## fix="center" pads both sides; resize() anchors at the start by default
>>> grow <- function(x, y) resize(x, width(x) + (2*y), fix="center")
>>> probes <- grow(granges(grset)[highVar], 1e5) ## +/- 100kb
>>>
>>>
> What this grow function is meant to do is already implemented as:
> granges(grset)[highVar] + 1e5
>
> If people like an alias like "grow" or "widen", we should consider adding
> it.
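
A quick check of that equivalence on toy data may help here. This is only a sketch: the coordinates are made up, and `grow` is the illustrative helper from the quoted message, not a function in any package. It assumes GenomicRanges is installed. The point is that resize() anchors at the start unless told otherwise, so the helper needs fix="center" to match what `+` does.

```r
library(GenomicRanges)

## toy ranges standing in for probe positions (made-up coordinates)
gr <- GRanges(c("chr1", "chr2"),
              IRanges(start = c(500000, 900000), width = 2))

## pad y bases on each side; fix = "center" keeps the original range centered
grow <- function(x, y) resize(x, width(x) + 2 * y, fix = "center")

padded_resize <- grow(gr, 1e5)
padded_plus   <- gr + 1e5    ## same padding via the `+` method for ranges

## with the default fix = "start", only the end moves
right_only <- resize(gr, width(gr) + 2 * 1e5)
```

Here padded_resize and padded_plus should cover the same coordinates, while right_only keeps the original start and extends 200kb downstream.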
>
>>> require(AnnotationHub)
>>> hub = AnnotationHub()
>>> m = metadata(hub)
>>> ##
>>> ## ...time passes...
>>> ##
>>>
>>> histoneMarks <- c('k27ac','k4me1','k4me3')
>>> names(histoneMarks) <- histoneMarks
>>>
>>> pre <- 'goldenpath.hg19.encodeDCC.wgEncodeBroadHistone.wgEncodeBroadHistone'
>>> post <- 'StdPk.broadPeak_0.0.1.RData'
>>> gm12878 <- GRangesList(lapply(histoneMarks, function(x)
>>>   hub[[paste0(pre, 'Gm12878H3', x, post)]]))
>>>
>>>
> I kind of think that GenomicRangesList() should be used here instead of
> GRangesList, for efficiency, generality and perhaps semantics.
>
>
>>> lapply(gm12878, function(x) names(subsetByOverlaps(probes, x)))
>>> ## $k27ac
>>> ## [1] "cg07238657" "cg06431905" "cg14555649" "cg00031967" "cg10311020"
>>> ## ...
>>> ##
>>> ## $k4me1
>>> ## [1] "cg25243082" "cg06431905" "cg00031967" "cg10311020" "cg05482956"
>>> ## ...
>>> ##
>>> ## $k4me3
>>> ## [1] "cg16220844" "cg24991732" "cg07238657" "cg06431905" "cg14555649"
>>> ## ...
>>>
>>> Is that pretty similar to what you were thinking?  The rest will be an
>>> issue of hunt-and-peck; you could also use countOverlaps, though it won't
>>> make it as easy to e.g. intersect h3k27ac and h3k4me1 to find active
>>> enhancers.
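
To make the countOverlaps versus intersect trade-off concrete, here is a small sketch on made-up ranges. The probe names cgA/cgB/cgC and all coordinates are hypothetical, and it assumes GenomicRanges is installed. It is meant only to show the shape of the two results, not real ENCODE data.

```r
library(GenomicRanges)

## made-up padded probes and made-up peak sets, for illustration only
probes <- GRanges("chr1", IRanges(start = c(100, 1000, 5000), width = 200))
names(probes) <- c("cgA", "cgB", "cgC")   ## hypothetical probe names
peaks <- GRangesList(
    k27ac = GRanges("chr1", IRanges(start = c(150, 1100), width = 50)),
    k4me1 = GRanges("chr1", IRanges(start = c(1050, 5100), width = 50))
)

## names of the probes hit by each mark, as in the lapply() call above
hits <- lapply(names(peaks), function(n)
    names(subsetByOverlaps(probes, peaks[[n]])))
names(hits) <- names(peaks)

## probes overlapping both marks (candidate active enhancers)
both <- intersect(hits$k27ac, hits$k4me1)

## countOverlaps gives per-probe peak counts rather than probe names
counts <- sapply(names(peaks), function(n) countOverlaps(probes, peaks[[n]]))
```

With names in hand the intersection is a one-liner; with the counts matrix you would instead filter rows where both columns are nonzero.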
>>>
>>> hope this helps,
>>>
>>> --t
>>>
>>>
>>>
>>> On Thu, Oct 24, 2013 at 11:21 AM, khadeeja ismail <hajjja@yahoo.com> wrote:
>>>
>>>> Thanks much for the help. Will have a go and let you know.
>>>> I have about 80 probes, from many different genes. I'm not sure if they
>>>> can be summarized, but sure it's worth having a look.
>>>>
>>>> BR,
>>>> Khadeeja
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thursday, October 24, 2013 8:53 PM, Martin Morgan <mtmorgan@fhcrc.org> wrote:
>>>>
>>>> On 10/24/2013 09:37 AM, khadeeja ismail wrote:
>>>> >
>>>> >
>>>> > Hi,
>>>> > I am working with some 450k array probes which I need to look up in the
>>>> > Genome Browser to see what types of regions these probes are located in,
>>>> > for example whether the CpG site (+/- 100kb) overlaps with any of the
>>>> > following in the GM12878 track.
>>>> >
>>>> >
>>>> > Layered H3K27Ac
>>>> > Layered H3K4Me1
>>>> > Layered H3K4Me3
>>>> > Transcription
>>>> > DNase Clusters
>>>> > DNase Clusters V1
>>>> > Txn Fac ChIP V3
>>>> > Txn Factor ChIP
>>>>
>>>> These tracks are available in AnnotationHub
>>>>
>>>>    library(AnnotationHub)
>>>>    hub = AnnotationHub()
>>>>    m = metadata(hub)
>>>>
>>>> and then
>>>>
>>>> > head(m$Description[grep("H3k27Ac", m$Description, ignore.case=TRUE)])
>>>> [1] "wgEncodeBroadHistoneHsmmtH3k27acStdPk"
>>>> [2] "wgEncodeBroadHistoneNhaH3k27acStdPk"
>>>> [3] "wgEncodeBroadHistoneA549H3k27acEtoh02Pk"
>>>> [4] "wgEncodeBroadHistoneK562H3k27acStdPk"
>>>> [5] "wgEncodeBroadHistoneGm12878H3k27acStdPk"
>>>> [6] "wgEncodeSydhHistoneMcf7H3k27acUcdPk"
>>>>
>>>> > xx = hub$goldenpath.hg19.encodeDCC.wgEncodeBroadHistone.wgEncodeBroadHistoneGm12878H3k27acStdPk.broadPeak_0.0.1.RData
>>>> Retrieving 'goldenpath/hg19/encodeDCC/wgEncodeBroadHistone/wgEncodeBroadHistoneGm12878H3k27acStdPk.broadPeak_0.0.1.RData'
>>>>
>>>> > head(xx)
>>>> GRanges with 6 ranges and 5 metadata columns:
>>>>       seqnames               ranges strand |        name     score signalValue
>>>>          <Rle>            <IRanges>  <Rle> | <character> <integer>   <numeric>
>>>>   [1]    chr22 [17091048, 17091199]      * |           .       579   11.651761
>>>>   [2]    chr22 [17305774, 17306441]      * |           .       531   10.111585
>>>>   [3]    chr22 [17517314, 17517945]      * |           .       527    9.991400
>>>>   [4]    chr22 [17518132, 17518819]      * |           .       837   19.847850
>>>>          pValue    qValue
>>>>       <numeric> <numeric>
>>>>   [1]       2.4        -1
>>>>   [2]      15.4        -1
>>>>   [3]     100.0        -1
>>>>   [4]      15.3        -1
>>>>  [ reached getOption("max.print") -- omitted 2 rows ]
>>>>
>>>> and then it is ready for findOverlaps or other GRanges operations. There's
>>>> a vignette in AnnotationHub
>>>>
>>>>   http://bioconductor.org/packages/release/bioc/html/AnnotationHub.html
>>>>
>>>> and it is mentioned in the annotation workflow; the AnnotatingRanges
>>>> workflow is also relevant
>>>>
>>>>   http://bioconductor.org/help/workflows/annotation/annotation/
>>>>   http://bioconductor.org/help/workflows/annotation/AnnotatingRanges/
>>>>
>>>> It would be interesting and useful to have this as a stand-alone workflow,
>>>> so if you do pursue this route and are interested in writing up a workflow
>>>> then let me know...
>>>>
>>>> Martin
>>>>
>>>>
>>>> >
>>>> >
>>>> > I would like to do it as a batch and not one by one, since the list of
>>>> > probes is long. I have tried querying the Genome Browser database and
>>>> > also the rtracklayer package in R, but have not been successful. It would
>>>> > be great if anyone could give me ideas on how it can be done.
>>>> >
>>>> > Thanking you,
>>>> > Khadeeja
>>>> >
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > Bioconductor mailing list
>>>> > Bioconductor@r-project.org
>>>> > https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> > Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>> >
>>>>
>>>>
>>>> --
>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N.
>>>> PO Box 19024 Seattle, WA 98109
>>>>
>>>> Location: Arnold Building M1 B861
>>>> Phone: (206) 667-2793
>>>>
>>>>
>>>>
>>>>
>>>
>>
>


