[R] Filling in missing time samples with na.approx
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Nov 29 19:51:07 CET 2010
On Mon, Nov 29, 2010 at 1:33 PM, Jason Edgecombe
<jason at rampaginggeek.com> wrote:
> On 11/29/2010 10:00 AM, Gabor Grothendieck wrote:
>>
>> On Mon, Nov 29, 2010 at 9:45 AM, Jason Edgecombe
>> <jason at rampaginggeek.com> wrote:
>>
>>>
>>> Hi Everyone,
>>>
>>> I have some data from a sports GPS device like the following:
>>>
>>> time latitude longitude altitude distance heartrate
>>> 1 1277648884 0.304048 -0.793819 260 0.000000 94
>>> 2 1277648885 0.304056 -0.793772 262 4.307615 95
>>> 3 1277648888 0.304060 -0.793696 263 11.262347 97
>>> 4 1277648894 0.304075 -0.793544 263 25.237911 103
>>> 5 1277648898 0.304085 -0.793455 263 33.322525 108
>>> 6 1277648902 0.304064 -0.793387 256 40.042988 115
>>>
>>> As you can see, the samples have irregular holes in the time column. How
>>> can I fill in the missing samples using na.approx?
>>>
>>> I've tried creating a blank series with no gaps and combining the two, but
>>> "merge" just adds columns and "rbind" complains about duplicate indexes.
>>>
>>> P.S. My GPS still has holes in the data when I turn off "smart
>>> recording" :(
>>>
>>>
>>
>> Try this:
>>
>> Lines <- "time latitude longitude altitude distance heartrate
>> 1277648884 0.304048 -0.793819 260 0.000000 94
>> 1277648885 0.304056 -0.793772 262 4.307615 95
>> 1277648888 0.304060 -0.793696 263 11.262347 97
>> 1277648894 0.304075 -0.793544 263 25.237911 103
>> 1277648898 0.304085 -0.793455 263 33.322525 108
>> 1277648902 0.304064 -0.793387 256 40.042988 115"
>>
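>> # read in data; the first column (time) becomes the index
>> library(zoo)
>> z <- read.zoo(textConnection(Lines), header = TRUE)
>>
>> # interpolate onto a regular one-second grid spanning the data
>> na.approx(z, xout = seq(min(time(z)), max(time(z))))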
>>
>>
>>
>>
>
> No change:
>> na.approx(z, xout = seq(min(time(z)), max(time(z))))
> latitude longitude altitude distance heartrate
> 1277648884 0.304048 -0.793819 260 0.000000 94
> 1277648885 0.304056 -0.793772 262 4.307615 95
> 1277648888 0.304060 -0.793696 263 11.262347 97
> 1277648894 0.304075 -0.793544 263 25.237911 103
> 1277648898 0.304085 -0.793455 263 33.322525 108
> 1277648902 0.304064 -0.793387 256 40.042988 115
>
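It works for me. Here is the complete session; the output has a row for
every second, with the missing samples linearly interpolated: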
> Lines <- "time latitude longitude altitude distance heartrate
+ 1277648884 0.304048 -0.793819 260 0.000000 94
+ 1277648885 0.304056 -0.793772 262 4.307615 95
+ 1277648888 0.304060 -0.793696 263 11.262347 97
+ 1277648894 0.304075 -0.793544 263 25.237911 103
+ 1277648898 0.304085 -0.793455 263 33.322525 108
+ 1277648902 0.304064 -0.793387 256 40.042988 115"
>
> # read in data
> library(zoo)
> z <- read.zoo(textConnection(Lines), header = TRUE)
>
> na.approx(z, xout = seq(min(time(z)), max(time(z))))
            latitude  longitude altitude  distance heartrate
1277648884 0.3040480 -0.7938190 260.0000  0.000000  94.00000
1277648885 0.3040560 -0.7937720 262.0000  4.307615  95.00000
1277648886 0.3040573 -0.7937467 262.3333  6.625859  95.66667
1277648887 0.3040587 -0.7937213 262.6667  8.944103  96.33333
1277648888 0.3040600 -0.7936960 263.0000 11.262347  97.00000
1277648889 0.3040625 -0.7936707 263.0000 13.591608  98.00000
1277648890 0.3040650 -0.7936453 263.0000 15.920868  99.00000
1277648891 0.3040675 -0.7936200 263.0000 18.250129 100.00000
1277648892 0.3040700 -0.7935947 263.0000 20.579390 101.00000
1277648893 0.3040725 -0.7935693 263.0000 22.908650 102.00000
1277648894 0.3040750 -0.7935440 263.0000 25.237911 103.00000
1277648895 0.3040775 -0.7935218 263.0000 27.259065 104.25000
1277648896 0.3040800 -0.7934995 263.0000 29.280218 105.50000
1277648897 0.3040825 -0.7934773 263.0000 31.301371 106.75000
1277648898 0.3040850 -0.7934550 263.0000 33.322525 108.00000
1277648899 0.3040797 -0.7934380 261.2500 35.002641 109.75000
1277648900 0.3040745 -0.7934210 259.5000 36.682756 111.50000
1277648901 0.3040693 -0.7934040 257.7500 38.362872 113.25000
1277648902 0.3040640 -0.7933870 256.0000 40.042988 115.00000
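
One possible cause of the "No change" result quoted above, offered as an
assumption rather than something confirmed in this thread, is an older zoo
installation whose na.approx does not yet honor xout. A minimal sketch that
does not rely on xout: merge z with a zero-width zoo series on a regular
one-second grid, which adds an NA row for each missing time, and then
interpolate. The names grid, zz and filled are only illustrative.

```r
library(zoo)

Lines <- "time latitude longitude altitude distance heartrate
1277648884 0.304048 -0.793819 260 0.000000 94
1277648885 0.304056 -0.793772 262 4.307615 95
1277648888 0.304060 -0.793696 263 11.262347 97
1277648894 0.304075 -0.793544 263 25.237911 103
1277648898 0.304085 -0.793455 263 33.322525 108
1277648902 0.304064 -0.793387 256 40.042988 115"

# read in data; the first column (time) becomes the index
z <- read.zoo(textConnection(Lines), header = TRUE)

# regular one-second grid covering the observed time range
grid <- seq(min(time(z)), max(time(z)))

# merging with a zero-width series adds an all-NA row per missing second
zz <- merge(z, zoo(, grid))

# linear interpolation fills the NA rows, column by column
filled <- na.approx(zz)
```

Unlike the attempt described in the original question, the merge here is
with a series that has an index but no data columns, so no extra columns are
added; only the missing time points appear, as NA rows, before na.approx
fills them.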
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com