[R] Recoding several variables into one use the most recent data
David Winsemius
dwinsemius at comcast.net
Mon Jun 27 19:48:46 CEST 2011
On Jun 27, 2011, at 12:56 PM, Christopher Desjardins wrote:
> Hi,
> I have the following data management issue. I am trying to combine multiple
> years of ethnicity data into one variable called ethnic. The data looks
> similar to the following:
>
> id ethnic07 ethnic08 ethnic09 ethnic10
> 1 1 1 1 1
> 2 1 1 2 2
> 3 3 4 4 NA
> 4 2 3 NA NA
>
> So, what I'd like to do is create a variable called 'ethnic' and I'd like to
> have this variable be filled with the most recent data available. So
> ethnic10 would have the highest priority, then ethnic09, followed by
> ethnic08, and finally ethnic07. So the ethnic variable based on the data
> above would look like the following:
>
> ethnic
> 1
> 2
> 4
> 3
>
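# Read a data frame from a character string holding whitespace-separated text.
rd.txt <- function(txt, header=TRUE, ...) {
    con <- textConnection(txt)
    on.exit(close(con))  # close only this connection, not every open one
    read.table(con, header=header, ...)
}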
> tail_noNA <- function(x,n) tail(x[!is.na(x)], n)
> tail_noNA(c(1,2,3,NA),1)
[1] 3
> dat <-rd.txt("id ethnic07 ethnic08 ethnic09 ethnic10
+ 1 1 1 1 1
+ 2 1 1 2 2
+ 3 3 4 4 NA
+ 4 2 3 NA NA")
> apply(dat, 1, tail_noNA, 1)
[1] 1 2 4 3
> as.matrix(apply(dat, 1, tail_noNA, 1))
     [,1]
[1,]    1
[2,]    2
[3,]    4
[4,]    3
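
One caveat with the apply() call above: it runs over every column, id included, so a row whose ethnicity columns are all NA would come back with its id value. A minimal sketch that restricts the search to the ethnicity columns and returns NA for such a row. The names yrs and last_noNA and the fifth data row are made up for illustration, and the columns are assumed to be ordered oldest to newest:

```r
# Illustrative data: the four rows from the question plus a hypothetical
# fifth row with no ethnicity recorded in any year.
dat <- data.frame(id       = 1:5,
                  ethnic07 = c(1, 1, 3, 2, NA),
                  ethnic08 = c(1, 1, 4, 3, NA),
                  ethnic09 = c(1, 2, 4, NA, NA),
                  ethnic10 = c(1, 2, NA, NA, NA))

# Columns to search, oldest first (assumed ordering).
yrs <- c("ethnic07", "ethnic08", "ethnic09", "ethnic10")

# Last non-missing value of x, or NA when every value is missing.
last_noNA <- function(x) {
  x <- x[!is.na(x)]
  if (length(x)) x[length(x)] else NA
}

# Row-wise over the ethnicity columns only, so id can never leak in.
dat$ethnic <- apply(dat[, yrs], 1, last_noNA)
```

With these five rows, dat$ethnic holds 1, 2, 4, 3 and NA.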
> I thought an ifelse() statement might work but I seem to be writing over my
> data every time I do this.
>
> Thanks,
> Chris
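
The ifelse() code is not shown in the question, so the cause of the overwriting is unknown. A minimal sketch of an ifelse() approach that should not overwrite anything: start from the newest year and fill in only the positions that are still NA. The loop and the free-standing vector ethnic are illustrative, not taken from the original post:

```r
# Data from the question.
dat <- data.frame(id       = 1:4,
                  ethnic07 = c(1, 1, 3, 2),
                  ethnic08 = c(1, 1, 4, 3),
                  ethnic09 = c(1, 2, 4, NA),
                  ethnic10 = c(1, 2, NA, NA))

# Start with the newest year, then fall back one year at a time.
ethnic <- dat$ethnic10
for (v in c("ethnic09", "ethnic08", "ethnic07")) {
  # Keep values already found; fill only the still-missing positions.
  ethnic <- ifelse(is.na(ethnic), dat[[v]], ethnic)
}
ethnic
# [1] 1 2 4 3
```

Because each step tests is.na(ethnic) on the running result rather than on a single year's column, a value taken from a newer year is never replaced by an older one.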
David Winsemius, MD
West Hartford, CT