[R] Batch replacement, by factor, of values in a data frame

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Aug 26 16:52:32 CEST 2009


Dear List,

I'm wondering if there is a better/cleaner/more efficient way of
replacing 0 values in a variable with the minimum of the non-missing and
non-zero values of that same variable, but doing it within the levels of
a factor?

Consider the dummy example data presented at the end of my message.
Within each 'Site' there are some 0 values and possibly some NAs. I can
compute the minimum of the non-missing, non-zero values by 'Site' using
aggregate(), as indicated below. Short of looping over the 'Site's and
replacing the 0s with the relevant minimum, is there a vectorised way
to do the replacement?

Thanks in advance,

G

## dummy data
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
## simulate some 0's
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
## just to complicate matters, some NA
D[sample(NROW(D), 3), "Var"] <- NA
head(D)
## Compute minimums per Site
aggregate(D$Var, by = list(Site = D$Site),
          FUN = function(x) min(x[x>0], na.rm = TRUE))
## How to replace the appropriate 0s with the appropriate minimum?
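
One possible vectorised route, offered only as a minimal sketch: base R's
ave() returns a vector the same length as its input, with a per-group
summary repeated for every row of the group, so the per-Site minimum can
be lined up against the rows and assigned in one step. The names
site_min, zero and Var2 are hypothetical and introduced here purely for
illustration; the dummy data are rebuilt so the block runs on its own.

```r
## dummy data, rebuilt exactly as in the message
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
D[sample(NROW(D), 3), "Var"] <- NA

## per-row copy of the minimum positive, non-missing value of each Site
## (site_min is a hypothetical helper name)
site_min <- ave(D$Var, D$Site,
                FUN = function(x) min(x[x > 0], na.rm = TRUE))

## rows holding a true 0; NAs are excluded so they stay NA
zero <- !is.na(D$Var) & D$Var == 0

## write into a new column (Var2, hypothetical) so the original is kept
D$Var2 <- D$Var
D$Var2[zero] <- site_min[zero]
```

Writing to a separate column is only a convenience for comparing before
and after; assigning back to Var directly should work the same way.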
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



