[R] Merge by Range in R
Mohammad Tanvir Ahamed
mashranga at yahoo.com
Mon Sep 4 14:31:17 CEST 2017
Hi,
I have two big data sets.
data_1:
> dim(data_1)
[1] 15820 5
> head(data_1)
   Chromosome  Start     End   Feature GroupA_3
1:       chr1 521369  750000 chr1-0001    0.170
2:       chr1 750001  800000 chr1-0002   -0.086
3:       chr1 800001  850000 chr1-0003    0.006
4:       chr1 850001  900000 chr1-0004    0.050
5:       chr1 900001  950000 chr1-0005    0.062
6:       chr1 950001 1000000 chr1-0006   -0.016
data_2:
> dim(data_2)
[1] 470870 5
> head(data_2)
   Chromosome Start   End    Feature GroupA_3
1:       chr1 15864 15865 cg13869341    0.207
2:       chr1 18826 18827 cg14008030   -0.288
3:       chr1 29406 29407 cg12045430   -0.331
4:       chr1 29424 29425 cg20826792   -0.074
5:       chr1 29434 29435 cg00381604    0.141
6:       chr1 68848 68849 cg20253340   -0.458
What I want to do:
Using the "Chromosome", "Start" and "End" columns of the two data sets, I want to find, for every range (between "Start" and "End") in data_1, which rows of data_2 (more precisely, which "Feature" values) fall inside it. The "Chromosome" value must also match between the two data sets.
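To make the request concrete, here is a minimal sketch of that lookup as an overlap join. It assumes both tables are data.table objects (the "1:" row labels in the printouts suggest so) and uses data.table's foverlaps(); the toy rows and the probe names cgA to cgD are invented for illustration and are not taken from the real data.

```r
library(data.table)

# Toy stand-ins for the two tables (values invented for illustration only)
data_1 <- data.table(
  Chromosome = c("chr1", "chr1", "chr2"),
  Start      = c(1L, 101L, 1L),
  End        = c(100L, 200L, 100L),
  Feature    = c("chr1-0001", "chr1-0002", "chr2-0001")
)
data_2 <- data.table(
  Chromosome = c("chr1", "chr1", "chr2", "chr2"),
  Start      = c(10L, 150L, 50L, 500L),
  End        = c(11L, 151L, 51L, 501L),
  Feature    = c("cgA", "cgB", "cgC", "cgD")
)

# foverlaps() needs the table being searched (the ranges) to be keyed,
# with the interval start and end as the last two key columns
setkey(data_1, Chromosome, Start, End)

# type = "within": keep rows of data_2 lying entirely inside a data_1 range
# on the same chromosome; nomatch = NULL drops data_2 rows with no such range
hits <- foverlaps(
  data_2, data_1,
  by.x    = c("Chromosome", "Start", "End"),
  type    = "within",
  nomatch = NULL
)

# Columns of data_2 that clash with data_1 names come back prefixed with "i."
result <- hits[, .(Chromosome, Bin = Feature, Probe = i.Feature)]
print(result)
```

On the toy data, cgD has no enclosing range and is dropped. On the real tables only the setkey() and foverlaps() calls would be needed, provided "Start" and "End" have the same type (integer or numeric) in both tables.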
I have tried the "GenomicRanges" package as described in this post:
https://stackoverflow.com/questions/11892241/merge-by-range-in-r-applying-loops
But I was not successful. Can anyone please help me do this quickly, as the data are very large?
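For reference, the GenomicRanges route mentioned above might look roughly like the following. This is only a sketch under assumptions: it requires the Bioconductor package GenomicRanges to be installed, the toy rows and probe names are invented, and it is not necessarily what the linked post does.

```r
library(GenomicRanges)

# Toy stand-ins for the two tables (values invented for illustration only)
data_1 <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2"),
  Start      = c(1L, 101L, 1L),
  End        = c(100L, 200L, 100L),
  Feature    = c("chr1-0001", "chr1-0002", "chr2-0001")
)
data_2 <- data.frame(
  Chromosome = c("chr1", "chr1", "chr2", "chr2"),
  Start      = c(10L, 150L, 50L, 500L),
  End        = c(11L, 151L, 51L, 501L),
  Feature    = c("cgA", "cgB", "cgC", "cgD")
)

# Build one GRanges object per table, carrying "Feature" as a metadata column
gr1 <- GRanges(seqnames = data_1$Chromosome,
               ranges   = IRanges(start = data_1$Start, end = data_1$End),
               Feature  = data_1$Feature)
gr2 <- GRanges(seqnames = data_2$Chromosome,
               ranges   = IRanges(start = data_2$Start, end = data_2$End),
               Feature  = data_2$Feature)

# type = "within": query ranges (data_2) fully contained in subject ranges
# (data_1); ranges on different chromosomes never overlap
hits <- findOverlaps(gr2, gr1, type = "within")

# Translate the hit indices back into the two "Feature" columns
result <- data.frame(
  Bin   = gr1$Feature[subjectHits(hits)],
  Probe = gr2$Feature[queryHits(hits)]
)
print(result)
```

Rows of data_2 without an enclosing range (cgD in the toy data) simply do not appear in the hits.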
Thanks in advance.
Regards.............
Tanvir Ahamed Stockholm, Sweden | mashranga at yahoo.com