[R] Batch replacement, by factor, of values in a data frame
Phil Spector
spector at stat.berkeley.edu
Wed Aug 26 17:06:48 CEST 2009
The ave function is very handy for things like this:
mins = ave(D$Var, D$Site, FUN = function(x) min(x[x > 0], na.rm = TRUE))
D$Var = ifelse(is.na(D$Var) | D$Var == 0, mins, D$Var)
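should do the required replacements. Note that, as written, it replaces
the NA values as well as the zeros; drop the is.na() test if the NAs
should stay missing.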
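To see what those two lines do, here is a minimal sketch on a small made-up data frame. The values are illustrative, not the data from the original question, and `filled` and `zeros_only` are names chosen only for this example.

```r
## Small illustrative data frame (assumed values, not the poster's data)
D <- data.frame(Site = factor(c("A", "A", "A", "A", "B", "B", "B")),
                Var  = c(0, 0.4, 0.2, NA, 0.7, 0, 0.5))

## ave() applies FUN within each level of Site and returns a vector
## the same length as D$Var, so each row carries its own site minimum
mins <- ave(D$Var, D$Site,
            FUN = function(x) min(x[x > 0], na.rm = TRUE))

## Replace both zeros and NAs with the site minimum
filled <- ifelse(is.na(D$Var) | D$Var == 0, mins, D$Var)

## Variant: replace zeros only; rows where Var is NA stay NA
## because the ifelse() test itself is NA there
zeros_only <- ifelse(D$Var == 0, mins, D$Var)

print(cbind(D, mins, filled, zeros_only))
```

Because ave() returns one value per row rather than one per group, no merge or loop over the sites is needed. If a site had no positive values at all, min() would return Inf with a warning, so that case may need separate handling.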
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu
On Wed, 26 Aug 2009, Gavin Simpson wrote:
> Dear List,
>
> I'm wondering if there is a better/cleaner/more efficient way of
> replacing 0 values in a variable with the minimum of the non-missing and
> non-zero values of that same variable, but doing it within the levels of
> a factor?
>
> Consider the dummy example data presented at the end of my message.
> Within each 'Site' there are some 0 values and possibly some NA's. I can
> compute the minimum of the non-missing and non-zero values by 'Site' as
> indicated below using aggregate for example. Save for looping over the
> 'Site's and replacing 0's with the relevant minimum is there a way of
> using a vectorised approach to do the replacement?
>
> Thanks in advance,
>
> G
>
> ## dummy data
> set.seed(123)
> D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
>                 Var = runif(5*10))
> D <- D[with(D, order(Site, Var)), ]
> ## simulate some 0's
> D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
> ## just to complicate matters, some NA
> D[sample(NROW(D), 3), "Var"] <- NA
> head(D)
> ## Compute minimums per Site
> aggregate(D$Var, by = list(Site = D$Site),
>           FUN = function(x) min(x[x > 0], na.rm = TRUE))
> ## How to replace the appropriate 0's with the appropriate minimum?
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Dr. Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC, UCL Geography, [f] +44 (0)20 7679 0565
> Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
> UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%