[R] Weighted variance function?
Gavin Simpson
gavin.simpson at ucl.ac.uk
Thu Jul 24 16:27:15 CEST 2008
On Thu, 2008-07-24 at 02:25 +0530, Arun Kumar Saha wrote:
> There is a R function to calculate weighted mean : weighted.mean() under
> stats package. Is there any direct R function for calculating weighted
> variance as well?
Here are two ways: weighted.var() uses the usual formula and
weighted.var2() uses a running-sums approach. Both formulae are given on
the Wikipedia page for the weighted mean, for example.
NA removal follows weighted.mean(), but I have not included any of the
sanity checks that that function contains.
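For reference, the two expressions that the functions below compute can be written out as follows. The symbols V_1, V_2 and \bar{x}_w are notation introduced here for illustration; they do not appear in the code.

```latex
\begin{aligned}
V_1 &= \sum_i w_i, \qquad V_2 = \sum_i w_i^2, \qquad
\bar{x}_w = \frac{\sum_i w_i x_i}{V_1} \\
s_w^2 &= \frac{V_1}{V_1^2 - V_2} \sum_i w_i \,(x_i - \bar{x}_w)^2 \\
      &= \frac{V_1 \sum_i w_i x_i^2 - \bigl(\sum_i w_i x_i\bigr)^2}{V_1^2 - V_2}
\end{aligned}
```

The second form follows from the first because \sum_i w_i (x_i - \bar{x}_w)^2 = \sum_i w_i x_i^2 - (\sum_i w_i x_i)^2 / V_1.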
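weighted.var <- function(x, w, na.rm = FALSE) {
    ## drop observations (and their weights) where x is NA
    if (na.rm) {
        w <- w[i <- !is.na(x)]
        x <- x[i]
    }
    sum.w <- sum(w)
    sum.w2 <- sum(w^2)
    mean.w <- sum(x * w) / sum(w)
    ## weighted sum of squared deviations, scaled by the correction factor
    (sum.w / (sum.w^2 - sum.w2)) * sum(w * (x - mean.w)^2, na.rm = na.rm)
}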
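weighted.var2 <- function(x, w, na.rm = FALSE) {
    ## drop observations (and their weights) where x is NA
    if (na.rm) {
        w <- w[i <- !is.na(x)]
        x <- x[i]
    }
    sum.w <- sum(w)
    ## same quantity, computed from the sums of w, w*x and w*x^2
    (sum(w*x^2) * sum.w - sum(w*x)^2) / (sum.w^2 - sum(w^2))
}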
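The sanity checks left out above could be added along the following lines. This is only a sketch: the name weighted.var.checked is hypothetical, and the particular checks (equal lengths, non-negative weights with no NA, a non-zero denominator) are assumptions about what is worth guarding against, not a copy of what weighted.mean() does.

```r
## Hypothetical variant: same formula as weighted.var(), plus input checks.
weighted.var.checked <- function(x, w, na.rm = FALSE) {
    if (length(x) != length(w))
        stop("'x' and 'w' must have the same length")
    if (na.rm) {
        keep <- !is.na(x)
        w <- w[keep]
        x <- x[keep]
    }
    if (anyNA(w) || any(w < 0))
        stop("'w' must be non-negative with no missing values")
    sum.w <- sum(w)
    denom <- sum.w^2 - sum(w^2)
    ## denom is zero when at most one weight is non-zero
    if (denom <= 0)
        stop("need at least two observations with non-zero weight")
    mean.w <- sum(w * x) / sum.w
    (sum.w / denom) * sum(w * (x - mean.w)^2)
}

## With equal weights this should agree with var()
weighted.var.checked(c(1, 2, 3, 4), rep(1, 4))
var(c(1, 2, 3, 4))
```

Stopping with an error on a zero denominator is a design choice; returning NaN, as the unchecked versions do, would be equally defensible.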
## from example section in ?weighted.mean
## GPA from Siegel 1994
wt <- c(5, 5, 4, 1)/15
x <- c(3.7,3.3,3.5,2.8)
weighted.mean(x,wt)
weighted.var(x, wt)
weighted.var2(x, wt)
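As a quick check on the example above, the two formulae can be evaluated directly and compared. The sketch below recomputes both expressions inline so that it runs on its own, and also checks that with equal weights the result reduces to var(x). The variable names (v1, v2, eq, v.eq) are illustrative only.

```r
wt <- c(5, 5, 4, 1)/15
x <- c(3.7, 3.3, 3.5, 2.8)

sum.w <- sum(wt)
denom <- sum.w^2 - sum(wt^2)
mean.w <- sum(wt * x) / sum.w

## usual formula, as in weighted.var()
v1 <- (sum.w / denom) * sum(wt * (x - mean.w)^2)
## running-sums formula, as in weighted.var2()
v2 <- (sum(wt * x^2) * sum.w - sum(wt * x)^2) / denom
all.equal(v1, v2)

## equal weights: the weighted variance should match var(x)
eq <- rep(1, length(x))
v.eq <- (sum(eq) / (sum(eq)^2 - sum(eq^2))) * sum(eq * (x - mean(x))^2)
all.equal(v.eq, var(x))
```

One caveat: the running-sums form subtracts two quantities of similar size, so it can lose precision when the variance is small relative to the mean. The first form is the safer choice in that situation.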
And some timings:
> system.time(replicate(100000, weighted.var(x, wt)))
   user  system elapsed
  2.679   0.014   2.820
> system.time(replicate(100000, weighted.var2(x, wt)))
   user  system elapsed
  2.224   0.010   2.315
HTH
G
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%