[R] Subsetting dataframe by the nearest values of a vector elements
Harun Rashid
mhrashidbau at yahoo.com
Tue Nov 10 09:39:12 CET 2015
Hi Jean,
Here is part of my data. As you can see, I have cross-section points and
the corresponding elevations of a river. I want to select cross-section
points at 50 m intervals, but the real cross-section data might not
contain the exact points 0, 50, 100, and so on. Therefore, I need to take
the points closest to those values.
cross_section elevation
1: 5.608 12.765
2: 11.694 10.919
3: 14.784 10.274
4: 20.437 7.949
5: 22.406 7.180
101: 594.255 7.710
102: 595.957 7.717
103: 597.144 7.495
104: 615.925 7.513
105: 615.890 7.751
I checked for some suggestions [particularly here
<http://stackoverflow.com/questions/20133344/find-closest-value-in-a-vector-with-binary-search>]
and finally did it like this:
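library(data.table)
intervals <- c(5,50,100,150,200,250,300,350,400,450,500,550,600)
# keep a copy of the real value; the join overwrites cross_section with the target
dt <- data.table(real.val = w$cross_section, w)
# setkey() sorts by cross_section and marks it as the key for the rolling join
setkey(dt, cross_section)
dt[J(intervals), roll = "nearest"]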
And it gave me what I wanted.
dt[J(intervals), roll = "nearest"]
cross_section real.val elevation
1: 5 5.608 12.765
2: 50 49.535 6.744
3: 100 115.614 8.026
4: 150 152.029 7.206
5: 200 198.201 6.417
6: 250 247.855 4.497
7: 300 298.450 11.299
8: 350 352.473 11.534
9: 400 401.287 10.550
10: 450 447.768 9.371
11: 500 501.284 8.984
12: 550 550.650 16.488
13: 600 597.144 7.495
I don’t know whether there is a smarter way to accomplish this!
Thanks in advance.
Regards,
Harun
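
For comparison, the nearest-point selection described above can also be sketched in base R, without a rolling join. This is only a minimal sketch: the data frame holds just the ten rows printed in the post (rows 1-5 and 101-105), and the `targets` vector is an illustrative assumption chosen to fall within those rows, not the 50 m grid used on the full data.

```r
# Rows 1-5 and 101-105 as printed in the post; the rows in between are not shown there.
w <- data.frame(
  cross_section = c(5.608, 11.694, 14.784, 20.437, 22.406,
                    594.255, 595.957, 597.144, 615.925, 615.890),
  elevation     = c(12.765, 10.919, 10.274, 7.949, 7.180,
                    7.710, 7.717, 7.495, 7.513, 7.751)
)

# Illustrative targets (assumption): picked so that each has a nearby row above.
targets <- c(5, 20, 600)

# For each target, take the row whose cross_section has the smallest absolute distance.
nearest_row <- vapply(
  targets,
  function(t) which.min(abs(w$cross_section - t)),
  integer(1)
)

# One row per target, with the target kept alongside the real cross_section value.
result <- data.frame(target = targets, w[nearest_row, ])
print(result)
```

This compares every target with every row, so it does not need the data to be sorted. On large data the rolling join is likely to be faster.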
On 11/10/15 11:17 AM, David Winsemius wrote:
>> On Nov 9, 2015, at 9:19 AM, Adams, Jean <jvadams at usgs.gov> wrote:
>>
>> Harun,
>>
>> Can you give a simple example?
>>
>> If your cross_section looked like this
>> c(144, 179, 214, 39, 284, 109, 74, 4, 249)
>> and your other vector looked like this
>> c(0, 50, 100, 150, 200, 250, 300, 350)
>> what would you want your subset to look like?
>>
>> Jean
>>
>> On Mon, Nov 9, 2015 at 7:26 AM, Harun Rashid via R-help <
>> r-help at r-project.org> wrote:
>>
>>> Hello,
>>> I have a dataset with two columns 1. cross_section (range: 0~635), and
>>> 2. elevation. The dataset has more than 100 rows. Now I want to make a
>>> subset on the condition that the 'cross_section' column will pick up the
>>> nearest cell from another vector (say 0, 50,100,150,200,.....,650).
>>> How can I do this? I would really appreciate a solution.
> If you want the "other vector" to define the “cell” boundaries, then, using Jean’s example, it is a simple application of `findInterval`:
>
>> inp <- c(144, 179, 214, 39, 284, 109, 74, 4, 249)
>> mids <- c(0, 50, 100, 150, 200, 250, 300, 350)
>> findInterval( inp, c(mids) )
> [1] 3 4 5 1 6 3 2 1 5
>
> On the other hand ...
>
> To find the index of the closest point in the "other vector", this might help:
>
>
>> findInterval(inp, c( mids[1]-.001, head(mids,-1)+diff(mids)/2, tail(mids,1)+.001 ) )
> [1] 4 5 5 2 7 3 2 1 6
>
>
>
> --
> David Winsemius
> Alameda, CA, USA
>
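
The indices returned by the midpoint version of `findInterval` above can be turned back into values by indexing into the grid. The sketch below does that with Jean's example vectors and checks the result against a brute-force search. It then swaps the roles of the two vectors, which is one possible way (an assumption about what was wanted, not something stated in the thread) to pick one data point per grid value. The names `breaks`, `nearest_mid`, `brute`, `x`, `x_breaks` and `picked` are introduced here for illustration only.

```r
inp  <- c(144, 179, 214, 39, 284, 109, 74, 4, 249)
mids <- c(0, 50, 100, 150, 200, 250, 300, 350)

# Break points halfway between consecutive grid values, as in the reply above.
breaks <- c(mids[1] - 0.001, head(mids, -1) + diff(mids) / 2, tail(mids, 1) + 0.001)
idx <- findInterval(inp, breaks)

# Nearest grid value for each data point.
nearest_mid <- mids[idx]

# Cross-check against a brute-force search over all grid values.
brute <- vapply(inp, function(v) mids[which.min(abs(mids - v))], numeric(1))
stopifnot(identical(nearest_mid, brute))

# Reverse direction: nearest data point for each grid value.
# Sort the data, cut at the midpoints between neighbours, and look up the grid values.
x <- sort(inp)
x_breaks <- head(x, -1) + diff(x) / 2
picked <- x[findInterval(mids, x_breaks) + 1]
print(data.frame(target = mids, nearest = picked))
```

In the reverse direction, grid values beyond the range of the data (here 300 and 350) both map to the last data point, 284, so duplicates may need to be dropped afterwards.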