[Rd] [R] computing the variance
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Mon Dec 5 20:33:16 CET 2005
Martin Maechler <maechler at stat.math.ethz.ch> writes:
> It seems Insightful at some point in time have given in to
> this user request, and S-plus nowadays has
> an argument "unbiased = TRUE"
> where the user can choose {to shoot (him/her)self in the leg and}
> require 'unbiased = FALSE'.
> {and there's also 'SumSquares = FALSE' which allows to not
> require any division (by N or N-1)}
>
> Since in some ``schools of statistics'' people are really still
> taught to use a 1/N variance, we could envisage to provide such an
> argument to var() {and cov()} as well. Otherwise, people define
> their own variance function such as
> VAR <- function(x, ...) .. (N-1)/N * var(x, ...)
> Should we?
Using the biased variance just because it is the MLE (if that is the
argument) seems confused to me. However, there's another point:
> var(sample(1:3, 100000, replace=TRUE))
[1] 0.6680556
i.e. if we consider x = 1:3 as the entire population, then the
variance when sampling from it is indeed E(X-EX)^2 =
1/N*sum((x-mean(x))^2) = 2/3, and not var(1:3) = 1, which is why
some presentations distinguish between the "population" and "sample"
variances. We might want to support this distinction somehow.
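
As a rough sketch of what the distinction looks like at user level
(pop_var is a made-up name for illustration, not an existing R function
and not a proposal for the var() interface; it assumes a plain numeric
vector):

```r
# Hypothetical helper, not part of base R: variance with divisor N,
# i.e. treating x as the entire population.
pop_var <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  # var() divides by n - 1, so rescale by (n - 1)/n to get the 1/N version
  (n - 1) / n * var(x)
}

x <- 1:3
var(x)                 # "sample" variance, divisor N - 1: 1
pop_var(x)             # "population" variance, divisor N: 0.6666667
mean((x - mean(x))^2)  # the same quantity computed directly
```

The value 2/3 is what the simulation with sample() above is approaching.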
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907