[R] Batch replacement, by factor, of values in a data frame
Gavin Simpson
gavin.simpson at ucl.ac.uk
Wed Aug 26 16:52:32 CEST 2009
Dear List,
Is there a better/cleaner/more efficient way of replacing the 0 values
in a variable with the minimum of the non-missing, non-zero values of
that same variable, doing the replacement within the levels of a
factor?
Consider the dummy example data presented at the end of my message.
Within each 'Site' there are some 0 values and possibly some NAs. I can
compute the minimum of the non-missing, non-zero values by 'Site', for
example with aggregate() as shown below. Short of looping over the
'Site's and replacing the 0s with the relevant minimum, is there a
vectorised way to do the replacement?
Thanks in advance,
G
## dummy data
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
## simulate some 0's
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
## just to complicate matters, some NA
D[sample(NROW(D), 3), "Var"] <- NA
head(D)
## Compute minimums per Site
aggregate(D$Var, by = list(Site = D$Site),
          FUN = function(x) min(x[x>0], na.rm = TRUE))
## How to replace the appropriate 0's with the appropriate minimum?
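
A minimal sketch of one possible vectorised approach, assuming base R's
ave() is acceptable here. ave() applies a function within the levels of
a factor and returns a vector as long as its input, so the per-Site
minimum lines up row by row with the data. The names site_min, zero and
Var2 are illustrative only and are not part of the data above; the
result goes in a new column so the original Var stays available for
comparison.

```r
## dummy data, rebuilt as in the message so this block runs on its own
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
D[sample(NROW(D), 3), "Var"] <- NA

## per-Site minimum of the positive, non-missing values,
## repeated for every row of that Site (same length as D$Var)
site_min <- ave(D$Var, D$Site,
                FUN = function(x) min(x[x > 0], na.rm = TRUE))

## logical index of the rows to change: exact zeros only, never NA
zero <- !is.na(D$Var) & D$Var == 0

## copy Var and overwrite just the zeros; NAs are left as NA
D$Var2 <- D$Var
D$Var2[zero] <- site_min[zero]
```

If a Site had no positive values at all, min() would return Inf with a
warning, so that case would need separate handling.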
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%