[R] time-series data and time-invariant missing values
Gabor Grothendieck
ggrothendieck at gmail.com
Mon Apr 6 13:52:39 CEST 2009
Check out na.locf in the zoo package. Here we fill in
NAs going forward and, in case there are NAs right at
the beginning, fill them in backward as well.
library(zoo)
x <- as.Date(c(NA, "2000-01-01", NA))
x2 <- na.locf(x, na.rm = FALSE)
x2 <- na.locf(x2, fromLast = TRUE, na.rm = FALSE)
gives:
> x2
[1] "2000-01-01" "2000-01-01" "2000-01-01"
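For a data frame with several ids, as in the question quoted below, the same fill can be applied within each id. The following is only a minimal sketch under assumptions: the data frame d, its values (including the second id), and the helper name fill_both are made up for illustration, and birth is assumed to be stored as a Date already.

```r
library(zoo)

# Hypothetical data modelled on the question; id 2 is invented
d <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  birth = as.Date(c(NA, "1976-01-01", NA, "1980-05-17", NA)),
  sex   = c(NA, "male", NA, "female", NA),
  year  = c(2006, 2007, 2008, 2007, 2008),
  stringsAsFactors = FALSE
)

# Hypothetical helper: fill NAs forward, then backward,
# so that NAs at the start of a group are covered as well
fill_both <- function(x) {
  x <- na.locf(x, na.rm = FALSE)
  na.locf(x, fromLast = TRUE, na.rm = FALSE)
}

# ave() applies the helper separately within each id
d$birth <- ave(d$birth, d$id, FUN = fill_both)
d$sex   <- ave(d$sex,   d$id, FUN = fill_both)
```

This works on whole groups at once rather than row by row, which should scale better than an explicit loop over observations. An id with no non-missing value at all would simply stay NA.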
On Mon, Apr 6, 2009 at 7:13 AM, Kunzler, Andreas <a.kunzler at bzaek.de> wrote:
> Dear list,
>
> I have some problems with time-series data and missing values in time-invariant information such as sex or birth date.
>
> Assume a data structure (d) like
>
> id  birth       sex   year of observation
> 1   NA          NA    2006
> 1   1976-01-01  male  2007
> 1   NA          NA    2008
>
> I am looking for a way to replace the missing values.
>
> Right now my solution to this problem is very slow in R:
>
>
>
> d$birth2 <- as.Date(NA) # pre-allocate the result column as Date
> for (i in seq_len(nrow(d))) { # for all observations
>   if (!is.na(d$birth[i])) { # birth of observation i is not missing
>     d$birth2[i] <- as.Date(d$birth[i], "%d.%m.%Y")
>   } else { # birth of observation i is missing: take the first non-missing value of the same id from another year
>     d$birth2[i] <- as.Date(d$birth[d$id == d$id[i] & !is.na(d$birth)], "%d.%m.%Y")[1]
>   }
> }
>
> Result:
>
>
> id  birth       sex   year of observation  birth2
> 1   NA          NA    2006                 1976-01-01
> 1   01.01.1976  male  2007                 1976-01-01
> 1   NA          NA    2008                 1976-01-01
>
> Unfortunately, the data consist of over 20,000 observations per year.
>
> Does anybody know a better way?
>
> Thanks
>
> Kind regards
>
> Andreas Kunzler
> ____________________________
> Bundeszahnärztekammer (BZÄK)
> Chausseestraße 13
> 10115 Berlin
>
> Tel.: 030 40005-113
> Fax: 030 40005-119
>
> E-Mail: a.kunzler at bzaek.de
>