[R] repeated searching of no-missing values
Patrizio Frederic
frederic.patrizio at gmail.com
Wed Dec 10 23:09:06 CET 2008
hi all,
I have a data frame such as:
1 blue 0.3
1 NA 0.4
1 red NA
2 blue NA
2 green NA
2 blue NA
3 red 0.5
3 blue NA
3 NA 1.1
I wish to find the last non-missing value of each column within every
triple of rows (i.e. within each value of the first column), so I want
a 3 by 3 data.frame such as:
1 red 0.4
2 blue NA
3 blue 1.1
I have written a little script
data = structure(list(V1 = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L
), V2 = structure(c(1L, NA, 3L, 1L, 2L, 1L, 3L, 1L, NA), .Label = c("blue",
"green", "red"), class = "factor"), V3 = c(0.3, 0.4, NA, NA,
NA, NA, 0.5, NA, 1.1)), .Names = c("V1", "V2", "V3"), class =
"data.frame", row.names = c(NA,
-9L))
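# cl returns the last non-missing element of a vector
cl = function(x) x[max(which(!is.na(x)))]
# group by the first column of data (x is a single column, so x[,1] would fail)
choose.last = function(x) tapply(x, data[,1], cl)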
# now function choose.last works properly on numeric vectors:
> choose.last(data[,3])
1 2 3
0.4 NA 1.1
# but not on factors (I lose the factor labels):
> choose.last(data[,2])
1 2 3
3 1 1
# moreover, if I apply this function to the whole data.frame
# the output is a character matrix
> apply(data,2,choose.last)
V1 V2 V3
1 "1" "red" "0.4"
2 "2" "blue" NA
3 "3" "blue" "1.1"
# and if I use sapply, I lose the factor labels
> sapply(data,choose.last)
V1 V2 V3
1 1 3 0.4
2 2 1 NA
3 3 1 1.1
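One direction that might avoid the coercion, as a minimal sketch only: pick row positions with tapply and then subset each column by those positions, since subsetting a factor keeps its labels. The names `d`, `last.index` and `res` are hypothetical and introduced just for this illustration; `d` rebuilds the same data as above, and the grouping column is assumed to be V1.

```r
# Same data as above, rebuilt so the sketch is self-contained (d is a hypothetical name)
d = data.frame(
  V1 = rep(1:3, each = 3),
  V2 = factor(c("blue", NA, "red", "blue", "green", "blue", "red", "blue", NA)),
  V3 = c(0.3, 0.4, NA, NA, NA, NA, 0.5, NA, 1.1)
)

# Hypothetical helper: for each group in g, the position of the last
# non-missing element of x, or NA when the whole group is missing
last.index = function(x, g) {
  idx = seq_along(x)
  idx[is.na(x)] = NA
  pos = tapply(idx, g, function(i) {
    if (all(is.na(i))) NA_integer_ else max(i, na.rm = TRUE)
  })
  as.integer(pos)  # drop the array attributes, keep plain positions
}

# Subset every column by its own positions; factors stay factors
res = as.data.frame(lapply(d, function(col) col[last.index(col, d$V1)]))
print(res)
```

Working with positions rather than values is the design choice here: tapply only ever sees integers, so it never gets the chance to strip the factor class, and an all-missing group simply yields an NA position.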
any hint?
Thanks in advance,
Patrizio
+-------------------------------------------------
| Patrizio Frederic, PhD
| Research associate in Statistics,
| Department of Economics,
| University of Modena and Reggio Emilia,
| Via Berengario 51,
| 41100 Modena, Italy
|
| tel: +39 059 205 6727
| fax: +39 059 205 6947
| mail: patrizio.frederic at unimore.it
+-------------------------------------------------