[R] how to merge within range?

William Dunlap wdunlap at tibco.com
Sat May 14 20:27:30 CEST 2011


You could use findInterval() along with a trick with c(rbind(...)):

> i <- findInterval(x=df.1$time, vec=c(rbind(df.2$from, df.2$to)))
> i
 [1] 1 1 1 2 3 3 3 5 5 6

The even-valued outputs would map to NA's, the odds
to value[(i+1)/2], but you can use the c(rbind(...)) trick again:

> c(rbind(df.2$value, NA))[i]
 [1]  1  1  1 NA  3  3  3  5  5 NA

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of René Mayer
> Sent: Saturday, May 14, 2011 11:06 AM
> To: David Winsemius
> Cc: r-help at r-project.org
> Subject: Re: [R] how to merge within range?
> 
> thanks David and Ian,
> let me make a better example as the first one was flawed
> 
> df.1=data.frame(round((1:10)*100+rnorm(10)), value=NA)
> names(df.1) = c("time", "value")
> df.1
>     time value
> 1   101    NA
> 2   199    NA
> 3   301    NA
> 4   401    NA
> 5   501    NA
> 6   601    NA
> 7   700    NA
> 8   800    NA
> 9   900    NA
> 10 1000    NA
> 
> # from and to define ranges within time,
> # note that from and to may not match the numbers given in time
> df.2=data.frame(from=c(99,500,799),to=c(303,702,950), value=c(1,3,5))
> df.2
>    from  to value
> 1   99 303     1
> 2  500 702     3
> 3  799 950     5
> 
> what I want is:
>     time value
> 1   101    1
> 2   199    1
> 3   301    1
> 4   401    NA
> 5   501    3
> 6   601    3
> 7   700    3
> 8   800    5
> 9   900    5
> 10 1000    NA
> 
> @David I don't know what you mean by 2 merges,
> René
> 
> 
> 
> 
> 
> Zitat von "David Winsemius" <dwinsemius at comcast.net>:
> 
> >
> > On May 14, 2011, at 9:16 AM, Ian Gow wrote:
> >
> >> If I assume that the third column in data.frame.2 is named 
> "val" then in
> >> SQL terms it _seems_ you want
> >>
> >> SELECT a.time, b.val FROM data.frame.1 AS a LEFT JOIN 
> data.frame.2 AS b ON
> >> a.time BETWEEN b.start AND b.end;
> >>
> >> Not sure how to do that elegantly using R subsetting/merge,
> >
> > Huh? It's just two merge()'s (... once you fix the error in 
> the example.)
> >
> > -- 
> > David
> >
> >> but you might
> >> try a package that allows you to use SQL, such as sqldf.
> >>
> >>
> >> On 5/14/11 8:03 AM, "David Winsemius" 
> <dwinsemius at comcast.net> wrote:
> >>
> >>>
> >>> On May 14, 2011, at 8:12 AM, René Mayer wrote:
> >>>
> >>>> Hello,
> >>>> how can one merge
> >>>
> >>> And what happened when you typed:
> >>>
> >>> ?merge
> >>>
> >>>> two data frames when in the second data frame one column 
> defines the
> >>>> start values
> >>>> and another defines the end value of the to be merged range.
> >>>> data.frame.1
> >>>> time ...
> >>>> 13
> >>>> 24
> >>>> 35
> >>>> 46
> >>>> 55
> >>>> ...
> >>>> data.frame.2
> >>>> start end
> >>>> 24 37 ?h? ?
> >>>> ...
> >>>>
> >>>> should result in this
> >>>> 13 NA
> >>>> 24 ?h?
> >>>> 35 ?h?
> >>>> 46 NA
> >>>> 55
> >>>> ?
> >>>
> >>> And _why_ would that be?
> >>>
> >>>
> >>>> thanks,
> >>>> René
> >>>>
> >>>> ______________________________________________
> >>>> R-help at r-project.org mailing list
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide
> >>>> http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, 
> reproducible code.
> >>>
> >>> David Winsemius, MD
> >>> West Hartford, CT
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
> >
> > David Winsemius, MD
> > West Hartford, CT
> >
> >
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 



More information about the R-help mailing list