On Thu, Jun 26, 2014 at 9:53 AM, gregory voisin <voisingreg@yahoo.fr> wrote:

> Hi Valerie,
> this is the code.
>
> I call my function:
>
> obj_build_list5 = mclapply(obj_build_list3, function(x)
> {addconservedTFSInformation(x)}  ,mc.cores =nbcores)
>
>
> addconservedTFSInformation<- function(annot_chr){#annot_chr is a gr object
>
>
>   #load the initial TFS data and create a GRanges Obj
>   conservedTFSFile  = as.data.frame(read.table())
>   conservedTFS      = GRanges(seqnames = conservedTFSFile$V2, ranges=
> IRanges(conservedTFSFile$V3, conservedTFSFile$V4))
>   mcols(conservedTFS)= conservedTFSFile[,c(5,7,8)]
>   colnames(mcols(conservedTFS)) = c("TS_name","TS_strand","Z_score")
>


The lines above turn a GFF file into a GRanges object. This is a common
operation, so rtracklayer already implements it:

conservedTFS <-
rtracklayer::import("../FullAnnotation450K/Annotation450KBuilder/data/conserved_TFBS_sites_ucsc_JUNE2014.gtf")
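
As a minimal sketch of what import() returns, the following writes a two-line
toy GTF file to a temporary path and reads it back. The file contents, the
"ucsc"/"tfbs" labels and the site names are made up for illustration. The
metadata column names depend on the attribute field of your own file, so it is
worth checking names(mcols()) before relying on a particular column.

```r
library(rtracklayer)

# Hypothetical toy GTF: two sites on chr1 (tab-separated, 1-based coordinates).
gtf_path <- tempfile(fileext = ".gtf")
writeLines(c(
  'chr1\tucsc\ttfbs\t100\t110\t800\t+\t.\tgene_id "AP1"; transcript_id "AP1";',
  'chr1\tucsc\ttfbs\t500\t515\t750\t-\t.\tgene_id "SP1"; transcript_id "SP1";'
), gtf_path)

# import() picks the parser from the file extension and returns a GRanges.
conservedTFS <- import(gtf_path)

# Inspect which metadata columns were created for this file.
print(names(mcols(conservedTFS)))
print(conservedTFS)
```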



>
>
>   #add the metaColumn for the conservedTFS annotation
>   values(annot_chr) <- cbind(values(annot_chr),
> DataFrame(conservedTFS_name     = rep(NA, length(annot_chr))))
>   values(annot_chr) <- cbind(values(annot_chr),
> DataFrame(conservedTFS_position = rep(NA, length(annot_chr))))
>   values(annot_chr) <- cbind(values(annot_chr),
> DataFrame(conservedTFS_distance = rep(NA, length(annot_chr))))
>   values(annot_chr) <- cbind(values(annot_chr),
> DataFrame(conservedTFS_score    = rep(NA, length(annot_chr))))
>   values(annot_chr) <- cbind(values(annot_chr),
> DataFrame(conservedTFS_strand   = rep(NA, length(annot_chr))))
>
>   #add information for each CpG
>   for (j in 1:length(annot_chr)){
>     #find overlap
>     ov_current = nearest(annot_chr[j],conservedTFS)
>     conservedTFS_current= conservedTFS[ov_current]
>     if(!(length(ov_current)==0)){
>       mcols(annot_chr)[["conservedTFS_name"]][j]       =
> as.vector(mcols(conservedTFS_current[1])[["TS_name"]])
>       mcols(annot_chr)[["conservedTFS_position"]][j]   =
> start(ranges(conservedTFS_current[1]))
>       mcols(annot_chr)[["conservedTFS_distance"]][j]   =
> start(ranges(conservedTFS_current[1]))- start(ranges(annot_chr[j]))
>       mcols(annot_chr)[["conservedTFS_score"]][j]      =
> as.vector(mcols(conservedTFS_current[1])[["Z_score"]])
>     }
>   }
>

The nearest() function is vectorized, so a single call returns the index of
the nearest TFS for every range in annot_chr, as a vector in the same order as
annot_chr:
n <- nearest(annot_chr, conservedTFS)
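
To illustrate the shape of that result, here is a small sketch on made-up
ranges (the coordinates and chromosome names are hypothetical). It also shows
that a range with no candidate on its chromosome gets NA rather than being
dropped, so the result always lines up with annot_chr.

```r
library(GenomicRanges)

# Hypothetical query positions: two on chr1, one on a chromosome with no TFS.
annot_chr <- GRanges(c("chr1", "chr1", "chr3"),
                     IRanges(c(105, 400, 50), width = 1))

# Hypothetical TFS ranges, all on chr1.
conservedTFS <- GRanges("chr1",
                        IRanges(start = c(100, 500), end = c(110, 515)))

# One call for all ranges; the result has one entry per range in annot_chr.
n <- nearest(annot_chr, conservedTFS)
print(n)

# Ranges without any nearest neighbor are marked with NA.
print(is.na(n))
```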

Then merge the information like this:
annot_chr$conservedTFS_name <- conservedTFS$name[n]
annot_chr$conservedTFS_position <- start(conservedTFS)[n]
annot_chr$conservedTFS_distance <- abs(start(conservedTFS)[n] -
start(annot_chr))
annot_chr$conservedTFS_score <- conservedTFS$score[n]
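
Here is a self-contained sketch of that merge on toy data, with a cross-check
against the one-range-at-a-time approach. The ranges, names and scores are
invented, and the "name" and "score" columns are assumed to exist; substitute
whatever names(mcols(conservedTFS)) reports for your file.

```r
library(GenomicRanges)

# Hypothetical toy data standing in for the real annotation and TFS objects.
annot_chr <- GRanges("chr1", IRanges(c(105, 400, 9000), width = 1))
conservedTFS <- GRanges("chr1",
                        IRanges(start = c(100, 500), end = c(110, 515)),
                        name = c("AP1", "SP1"),
                        score = c(2.1, 1.8))

# Index of the nearest TFS for every range, computed in one call.
n <- nearest(annot_chr, conservedTFS)

# Copy the matching TFS information into new metadata columns.
annot_chr$conservedTFS_name     <- conservedTFS$name[n]
annot_chr$conservedTFS_position <- start(conservedTFS)[n]
annot_chr$conservedTFS_distance <- abs(start(conservedTFS)[n] - start(annot_chr))
annot_chr$conservedTFS_score    <- conservedTFS$score[n]

# Cross-check: looking up each range separately gives the same positions.
loop_position <- vapply(seq_along(annot_chr), function(j) {
  start(conservedTFS)[nearest(annot_chr[j], conservedTFS)]
}, integer(1))
stopifnot(identical(loop_position, annot_chr$conservedTFS_position))

print(annot_chr)
```

Note that abs() makes the distance unsigned, whereas the loop in the original
code stored the signed difference; drop abs() if the direction matters.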

Hope this helps get you started,
Michael


>   #save the object : one object by chromosome
>   nb= unlist(strsplit(chr, "chr"))[2]
>   name_obj = paste0(nb,".FullAnnotation450K_",nb, ".RData")
>   save (annot_chr, file = name_obj)
>
>   #return
>   return(annot_chr)
> }
>
>
>
>
> On Thursday, June 26, 2014 at 12:04, Valerie Obenchain <vobencha@fhcrc.org>
> wrote:
>
>
>
> Hi,
>
> I think you mean the GRanges class. GenomicRanges is a virtual class,
> GRanges is the concrete subclass.
>
> Please show a reproducible example of what you're trying to do. When you
> provide an example, instead of asking for an apply function for GRanges,
> others on the list can see what the end goal is and suggest
> alternatives. Using an *apply function may not be the best approach.
>
> Valerie
>
>
>
> On 06/26/2014 06:41 AM, Maintainer wrote:
> > Hello,
> >
> > Does an apply function exist for GenomicRanges objects? Here, I am not
> > talking about a GenomicRangesList object but a GenomicRanges object.
> > Would it be pertinent to implement one?
> >
> > Currently, I populate my gr object with a for loop: depending on the
> > position of the gene, I add some information to mcols(gr).
> > Unsurprisingly, the for loop is very inefficient.
> >
> > Greg.
> > Lady Davis Institute
> > Montreal
> >
> >   -- output of sessionInfo():
> >
> > > sessionInfo()
> > R version 3.0.1 (2013-05-16)
> > Platform: x86_64-redhat-linux-gnu (64-bit)
> >
> > locale:
> >   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
> >   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
> >   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
> >   [7] LC_PAPER=C                 LC_NAME=C
> >   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] parallel  stats     graphics  grDevices utils     datasets  methods
> > [8] base
> >
> > other attached packages:
> > [1] ade4_1.6-2          IRanges_1.20.7      BiocGenerics_0.10.0
> >
> > loaded via a namespace (and not attached):
> > [1] stats4_3.0.1 tools_3.0.1
> >
> >
> > --
> > Sent via the guest posting facility at bioconductor.org.
> >
> >
>
>
> --
> Valerie Obenchain
> Program in Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, Seattle, WA 98109
>
> Email: vobencha@fhcrc.org
> Phone: (206) 667-3158
