# [R] How to speed up interpolation

James Rome jamesrome at gmail.com
Sun Jul 17 19:30:21 CEST 2011

`df` is a very large data frame with arrival estimates for many flights
(`df$flightfact`) at random times (`df$PredTime`). The error of each estimate
is `df$dt`.
My problem is that I want to know the prediction error at each minute
before landing. This code works, but it is very slow and dominates
the run time. I tried using split(), but that rapidly ate up my 12 GB of
memory. So, is there a better R way of doing this?

Thanks,
Jim Rome

```r
# Interpolate the times to every minute for 60 minutes
# Return a new data frame
interpolateTimes = function(df) {
    x = as.numeric(seq(from=0, to=60))  # The times to interpolate to
    # rule=1 returns NA for times outside the observed PredTime range
    dti = approx(as.numeric(df$PredTime), as.numeric(df$dt), x,
                 method="linear", rule=1)
    # Make a new data frame of interpolated values
    idf = data.frame(time=dti$x, error=dti$y,
                     runway=rep(df$lrw[1], length(dti$x)),
                     flight=rep(df$flightfact[1], length(dti$x)))
    return(idf)
}

# Count the rows for each flight that actually occurs in df
flights = table(df$flightfact[1:dim(df)[1], drop=TRUE])
nflights = length(flights)
flights = as.data.frame(flights)
times = data.frame()
# Split by flight
for(i in 1:nflights) {
    # This flight: match on the factor label rather than the integer code
    tf = df[df$flightfact == as.character(flights[i,1]),]
    # Check for at least 2 entries
    if(dim(tf)[1] < 2) {
        next
    }
    idf = interpolateTimes(tf)
    times = rbind(times, idf)
}
```
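
One direction that might be worth trying, sketched here under assumptions rather than offered as a measured fix: split the row indices instead of the data frame, interpolate each flight into one column of a matrix, and build the result data frame once at the end. This avoids subsetting all of `df` for every flight and growing `times` with `rbind()` on each pass, both of which tend to get expensive as the number of flights grows. The function name `interpolateAll` and the small `demo` data frame are hypothetical and only there to make the sketch runnable; the column names are the ones used in the post.

```r
# Hypothetical helper (not from the original post): interpolate every flight
# onto the same minute grid without copying sub-data-frames.
interpolateAll <- function(df, x = 0:60) {
  # Split row numbers by flight; this is much lighter than split(df, ...)
  idx <- split(seq_len(nrow(df)), df$flightfact, drop = TRUE)
  # Keep only flights with at least 2 entries, as in the original loop
  idx <- idx[lengths(idx) >= 2]
  # One column of interpolated errors per flight (NA outside the observed range)
  err <- vapply(idx, function(i) {
    approx(as.numeric(df$PredTime[i]), as.numeric(df$dt[i]),
           xout = x, method = "linear", rule = 1)$y
  }, numeric(length(x)))
  # Row number of the first entry of each flight, used for runway and flight id
  first <- vapply(idx, function(i) i[1], integer(1))
  # Build the result once; as.vector() unrolls the matrix flight by flight
  data.frame(time   = rep(x, times = length(idx)),
             error  = as.vector(err),
             runway = rep(df$lrw[first], each = length(x)),
             flight = rep(df$flightfact[first], each = length(x)))
}

# Made-up data for illustration only: flight C has a single entry and is skipped
demo <- data.frame(
  flightfact = factor(c("A", "A", "A", "B", "B", "C")),
  PredTime   = c(0, 30, 60, 10, 50, 5),
  dt         = c(0, 3, 6, 1, 5, 2),
  lrw        = c("27L", "27L", "27L", "09R", "09R", "27L")
)
res <- interpolateAll(demo)
```

Whether this helps on the real data would need to be checked with `system.time()`. Like the original loop, it still stops with an error if a flight has fewer than two usable `PredTime` values.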