[R] Best way to do temporal joins in R?

Jonathan Greenberg greenberg at ucdavis.edu
Mon Mar 16 20:11:39 CET 2009


Sorry for the immediate follow-up, but Phil Spector correctly reminded 
me that this is a lot easier for the community if I provide some sample 
data, so I'm attaching 3 small CSVs to this email:

species_data_Rexample.csv contains the "field data" (which species was 
ID'd and at what time it was ID'd), and
temperature_data_Rexample.csv contains the date, time, station ID and 
the temperature "value".

I'd like a dataframe which contains, for each unique line in 
species_data_Rexample.csv, a series of lines, one per station, with the 
temperature at the nearest time stamp or an interpolated value 
(a weighted average would be fine, but so would just grabbing the nearest 
value). For this example I'd like something that looks like the third 
csv, "fused_data_Rexample.csv".
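
To make the request concrete, here is a rough base-R sketch of the kind of 
join described above. It is only an illustration under assumptions: the 
column names (UniqueID, Date1, Species, Date2, Station, Value), the toy 
values, and the helper name fuse_by_time are all made up for this example 
and are not taken from the attached CSVs. It splits the temperature table 
by station, then for each observation picks the nearest reading with 
which.min() and a linearly interpolated one with approx().

```r
# Toy stand-ins for the two input tables (hypothetical columns and values).
species <- data.frame(
  UniqueID = 1:3,
  Date1 = as.POSIXct(c("2009-03-01 10:05:00", "2009-03-01 10:20:00",
                       "2009-03-01 10:20:00"), tz = "UTC"),
  Species = c("sp1", "sp2", "sp3"),
  stringsAsFactors = FALSE
)
temperature <- data.frame(
  Date2 = as.POSIXct(rep(c("2009-03-01 10:00:00", "2009-03-01 10:30:00"),
                         times = 2), tz = "UTC"),
  Station = rep(c("A", "B"), each = 2),
  Value = c(10, 16, 20, 23),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one output row per (observation, station) pair.
fuse_by_time <- function(obs, temps) {
  t_obs <- as.numeric(obs$Date1)  # seconds since the epoch
  per_station <- lapply(split(temps, temps$Station), function(st) {
    st <- st[order(st$Date2), ]
    t_st <- as.numeric(st$Date2)
    # Index of the station reading closest in time to each observation.
    nearest <- vapply(t_obs, function(t) which.min(abs(t_st - t)),
                      integer(1))
    data.frame(
      obs,
      Station = st$Station[1],
      NearestValue = st$Value[nearest],
      # Linear interpolation; rule = 2 reuses the end value when an
      # observation falls outside the station's time range.
      InterpValue = approx(t_st, st$Value, xout = t_obs, rule = 2)$y,
      stringsAsFactors = FALSE
    )
  })
  fused <- do.call(rbind, per_station)
  fused <- fused[order(fused$UniqueID, fused$Station), ]
  rownames(fused) <- NULL
  fused
}

fused <- fuse_by_time(species, temperature)
print(fused)
```

Repeated time stamps in either table are not a problem here, because 
nothing is indexed by time; the only requirement is that each station has 
at least two readings so that approx() can interpolate. The per-observation 
which.min() scan is simple rather than fast, so for large tables something 
like findInterval() on the sorted station times may be a better fit.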

Thanks!

--j

Jonathan Greenberg wrote:
> I've been playing with zoo a bit, and it seems ok except it doesn't 
> support non-unique time stamps when performing joins.  I have two 
> databases which contain a dataframe of a Date object (with the time, 
> not just MM/DD/YY), e.g.:
>
> DB 1:
> UniqueID,Date1,Data 1,Data 2
>
> DB 2:
> Date2, Station, Data 3
>
> We'll say Station can contain three values: A,B and C
>
> DB 1 may have some repeat times, and DB 2 definitely has them -- 
> although each Date, Station combo is unique (this DB contains weather 
> data collected on the half-hour or fifteen minute interval at a set of 
> stations).  I'd like DB2's station and Data3 to be joined with DB1 
> based on the nearest time stamp (interpolating Data3 or not).
>
> Ideally, I'd like a fused database such that I get for each uniqueID 
> in DB1:
>
> UniqueID,Date,Data1,Data2,Station,Data3
>
> Thoughts?  Hints?
>
> --j
>
>
>

