[BioC] [devteam-bioc] apply function on genomicsRanges ob

Valerie Obenchain vobencha at fhcrc.org
Thu Jun 26 19:27:32 CEST 2014


A copy of your code is not a reproducible example. By 'reproducible' I 
mean code that others can copy from your email into an R session and it 
will run.

Please read the posting guide:
http://www.bioconductor.org/help/mailing-list/posting-guide/

If your question is about adding metadata columns to GRanges, create a 
small GRanges

   gr <- GRanges("chr1", IRanges(1:10, width=1))

or show a few lines of the GRanges read in with 
addconservedTFSInformation(). Use this small GRanges to demonstrate the 
loop you're having trouble with. Be clear about what your matching 
criteria are (simple overlap of ranges? gene id?) and what metadata you 
are trying to add to the rows that match.
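
As a rough sketch of what such a small example could look like: the toy 
'tfs' ranges and their TS_name / Z_score values below are invented for 
illustration, the conservedTFS_* column names simply mirror the quoted 
code, and the matching criterion is assumed to be "nearest range". Under 
those assumptions the metadata can be filled in without a loop, because 
nearest() works on the whole GRanges at once:

```r
library(GenomicRanges)

# Small query GRanges, as suggested above
gr <- GRanges("chr1", IRanges(1:10, width=1))

# Hypothetical stand-in for the conserved TFS data (values are made up)
tfs <- GRanges("chr1", IRanges(c(3, 8), width=1),
               TS_name=c("A", "B"), Z_score=c(1.5, 2.5))

# One vectorised call: for each range in 'gr', the index of the nearest
# range in 'tfs' (NA when no nearest range is found)
idx <- nearest(gr, tfs)

# Subsetting by 'idx' propagates NA, so unmatched rows stay NA
mcols(gr)$conservedTFS_name     <- mcols(tfs)$TS_name[idx]
mcols(gr)$conservedTFS_position <- start(tfs)[idx]
mcols(gr)$conservedTFS_distance <- start(tfs)[idx] - start(gr)
mcols(gr)$conservedTFS_score    <- mcols(tfs)$Z_score[idx]

gr
```

An example of this size can be pasted straight into an R session, which 
makes it much easier for others to see where the real code differs.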


Valerie



On 06/26/2014 09:53 AM, gregory voisin wrote:
> Hi Valerie,
> This is the code.
>
> I call my function:
>
> obj_build_list5 = mclapply(obj_build_list3, addconservedTFSInformation,
>                            mc.cores = nbcores)
>
>
> addconservedTFSInformation <- function(annot_chr) { # annot_chr is a GRanges object
>    # load the initial TFS data and create a GRanges object
>    conservedTFSFile = read.table("../FullAnnotation450K/Annotation450KBuilder/data/conserved_TFBS_sites_ucsc_JUNE2014.gtf")
>    conservedTFS = GRanges(seqnames = conservedTFSFile$V2,
>                           ranges = IRanges(conservedTFSFile$V3, conservedTFSFile$V4))
>    mcols(conservedTFS) = conservedTFSFile[, c(5, 7, 8)]
>    colnames(mcols(conservedTFS)) = c("TS_name", "TS_strand", "Z_score")
>    # add the metadata columns for the conservedTFS annotation
>    values(annot_chr) <- cbind(values(annot_chr),
>        DataFrame(conservedTFS_name     = rep(NA, length(annot_chr))))
>    values(annot_chr) <- cbind(values(annot_chr),
>        DataFrame(conservedTFS_position = rep(NA, length(annot_chr))))
>    values(annot_chr) <- cbind(values(annot_chr),
>        DataFrame(conservedTFS_distance = rep(NA, length(annot_chr))))
>    values(annot_chr) <- cbind(values(annot_chr),
>        DataFrame(conservedTFS_score    = rep(NA, length(annot_chr))))
>    values(annot_chr) <- cbind(values(annot_chr),
>        DataFrame(conservedTFS_strand   = rep(NA, length(annot_chr))))
>    # add information for each CpG
>    for (j in seq_along(annot_chr)) {
>      # find the nearest conserved TFS; nearest() returns NA if there is none
>      ov_current = nearest(annot_chr[j], conservedTFS)
>      if (!is.na(ov_current)) {
>        conservedTFS_current = conservedTFS[ov_current]
>        mcols(annot_chr)[["conservedTFS_name"]][j] =
>            as.vector(mcols(conservedTFS_current[1])[["TS_name"]])
>        mcols(annot_chr)[["conservedTFS_position"]][j] =
>            start(ranges(conservedTFS_current[1]))
>        mcols(annot_chr)[["conservedTFS_distance"]][j] =
>            start(ranges(conservedTFS_current[1])) - start(ranges(annot_chr[j]))
>        mcols(annot_chr)[["conservedTFS_score"]][j] =
>            as.vector(mcols(conservedTFS_current[1])[["Z_score"]])
>      }
>    }
>    # save the object: one object per chromosome
>    # ('chr' was not defined in the code as posted; assumed here to be the
>    # chromosome name of the first range in annot_chr)
>    chr = as.character(seqnames(annot_chr))[1]
>    nb = unlist(strsplit(chr, "chr"))[2]
>    name_obj = paste0(nb, ".FullAnnotation450K_", nb, ".RData")
>    save(annot_chr, file = name_obj)
>    # return
>    return(annot_chr)
> }
>
>
>
>
> On Thursday, 26 June 2014 at 12:04, Valerie Obenchain <vobencha at fhcrc.org>
> wrote:
>
>
> Hi,
>
> I think you mean the GRanges class. GenomicRanges is a virtual class;
> GRanges is the concrete subclass.
>
> Please show a reproducible example of what you're trying to do. When you
> provide an example, instead of asking for an apply function for GRanges,
> others on the list can see what the end goal is and suggest
> alternatives. Using an *apply function may not be the best approach.
>
> Valerie
>
>
> On 06/26/2014 06:41 AM, Maintainer wrote:
>  > Hello,
>  >
>  > Does an apply function exist for genomicRanges objects? Here, I am not
>  > talking about a genomicRangesList object but a genomicRanges object.
>  > Would it be pertinent to implement one?
>  >
>  > Currently, I populate my gr object with a for loop: depending on the
>  > position of the gene, I add some information to mcols(gr).
>  > Unsurprisingly, the for loop is very inefficient.
>  >
>  > Greg.
>  > Lady Davis Institute
>  > Montreal
>  >
>  >  -- output of sessionInfo():
>  >
>  >> sessionInfo()
>  > R version 3.0.1 (2013-05-16)
>  > Platform: x86_64-redhat-linux-gnu (64-bit)
>  >
>  > locale:
>  >  [1] LC_CTYPE=en_US.UTF-8      LC_NUMERIC=C
>  >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  >  [7] LC_PAPER=C                LC_NAME=C
>  >  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>  > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>  >
>  > attached base packages:
>  > [1] parallel  stats    graphics  grDevices utils    datasets  methods
>  > [8] base
>  >
>  > other attached packages:
>  > [1] ade4_1.6-2          IRanges_1.20.7      BiocGenerics_0.10.0
>  >
>  > loaded via a namespace (and not attached):
>  > [1] stats4_3.0.1 tools_3.0.1
>  >
>  >
>  > --
>  > Sent via the guest posting facility at bioconductor.org.
>
>  >
>
>
> --
> Valerie Obenchain
> Program in Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, Seattle, WA 98109
>
> Email: vobencha at fhcrc.org <mailto:vobencha at fhcrc.org>
> Phone: (206) 667-3158
>
>
>


