[R] Merge by Range in R

jim holtman jholtman at gmail.com
Mon Sep 4 20:37:53 CEST 2017


Have you tried 'foverlaps' in the data.table package?
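
A minimal sketch of how foverlaps might be applied to this problem, not a
tested solution for the full data. The names regions, probes, Region, Probe,
ProbeStart and ProbeEnd are made up for illustration, and the probe rows
marked "hypothetical" are invented so that the toy example produces matches;
the other values come from the head() output quoted below. The GroupA_3
column is left out for brevity.

```r
library(data.table)

# Toy stand-in for data_1: one row per genomic range.
regions <- data.table(
  Chromosome = "chr1",
  Start      = c(521369L, 750001L, 800001L),
  End        = c(750000L, 800000L, 850000L),
  Region     = c("chr1-0001", "chr1-0002", "chr1-0003")
)

# Toy stand-in for data_2: one row per probe.
probes <- data.table(
  Chromosome = c("chr1", "chr1", "chr1", "chr1", "chr2"),
  ProbeStart = c(15864L, 18826L, 600000L, 760000L, 600000L),
  ProbeEnd   = c(15865L, 18827L, 600001L, 760001L, 600001L),
  Probe      = c("cg13869341", "cg14008030",
                 "cg_hypothetical_1",   # hypothetical, inside chr1-0001
                 "cg_hypothetical_2",   # hypothetical, inside chr1-0002
                 "cg_hypothetical_3")   # hypothetical, wrong chromosome
)

# foverlaps() requires the second table to be keyed; the last two key
# columns are treated as the interval start and end.
setkey(regions, Chromosome, Start, End)

# type = "within" keeps probes lying entirely inside a range;
# nomatch = NULL drops probes that fall in no range (inner join).
hits <- foverlaps(
  probes, regions,
  by.x    = c("Chromosome", "ProbeStart", "ProbeEnd"),
  by.y    = key(regions),
  type    = "within",
  nomatch = NULL
)

print(hits[, .(Region, Probe)])
```

Because Chromosome is the first key column, a probe is only compared with
ranges on the same chromosome. With nomatch = NA (the default) the unmatched
probes would be kept, with NA in the columns that come from regions.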


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Mon, Sep 4, 2017 at 8:31 AM, Mohammad Tanvir Ahamed via R-help <
r-help at r-project.org> wrote:

> Hi,
> I have two large data sets.
>
> data_1:
> > dim(data_1)
> [1] 15820     5
>
> > head(data_1)
>    Chromosome  Start     End   Feature GroupA_3
> 1:       chr1 521369  750000 chr1-0001    0.170
> 2:       chr1 750001  800000 chr1-0002   -0.086
> 3:       chr1 800001  850000 chr1-0003    0.006
> 4:       chr1 850001  900000 chr1-0004    0.050
> 5:       chr1 900001  950000 chr1-0005    0.062
> 6:       chr1 950001 1000000 chr1-0006   -0.016
>
> data_2:
> > dim(data_2)
> [1] 470870      5
>
> > head(data_2)
>    Chromosome Start   End    Feature GroupA_3
> 1:       chr1 15864 15865 cg13869341    0.207
> 2:       chr1 18826 18827 cg14008030   -0.288
> 3:       chr1 29406 29407 cg12045430   -0.331
> 4:       chr1 29424 29425 cg20826792   -0.074
> 5:       chr1 29434 29435 cg00381604    0.141
> 6:       chr1 68848 68849 cg20253340   -0.458
>
>
> What I want to do:
> Using the columns "Chromosome", "Start" and "End" of the two data sets, I
> want to find which rows (more precisely, which "Feature" values) of data_2
> fall within each range (between "Start" and "End") of data_1. The
> "Chromosome" values must also match between the two data sets.
>
> I have tried the "GenomicRanges" package, as described in the post
> https://stackoverflow.com/questions/11892241/merge-by-range-in-r-applying-loops
> but I was not successful. Can anyone please help me do this quickly, as
> the data are very big?
> Thanks in advance.
>
>
> Regards.............
> Tanvir Ahamed Stockholm, Sweden     |  mashranga at yahoo.com
