[R] Efficiently calculate sd on an array?
Charles C. Berry
cberry at tajo.ucsd.edu
Sun Jun 17 18:56:31 CEST 2007
On Sun, 17 Jun 2007, Gavin Simpson wrote:
> Dear list,
>
> Consider the following problem:
>
> n.obs <- 167
> n.boot <- 100
> arr <- array(runif(n.obs*n.obs*n.boot), dim = c(n.obs, n.obs, n.boot))
> arr[sample(n.obs, 3), sample(n.obs, 3), ] <- NA
>
> Given the array arr, with dims = 167*167*100, I would like to calculate
> the sd of the values in the 3rd dimension of arr, and an obvious way to
> do this is via apply():
>
> system.time(res <- apply(arr, c(2,1), sd, na.rm = TRUE))
>
> This takes over 4 seconds on my desktop.
>
> I have found an efficient way to calculate the means of the 3rd
> dimension using
>
> temp <- t(rowMeans(arr, na.rm = TRUE, dims = 2))
>
> instead of
>
> temp <- apply(arr, c(2,1), mean, na.rm = TRUE)
>
> but I am having difficulty seeing how to calculate the standard
> deviations efficiently.
>
> Any idea how I might go about this?
Here are timings on my system:
> system.time(res <- apply(arr, c(2,1), sd, na.rm = TRUE))
user system elapsed
3.49 0.00 3.52
> system.time(res2 <- {
+ ns <- rowSums(!is.na(arr),dim=2)
+ mns <- as.vector(rowMeans(arr, na.rm = TRUE, dims = 2))
+ sds <- t(sqrt(rowSums( (arr- mns )^2,na.rm=T,dims=2)/as.vector(ns-1)))
+ sds[t(ns)==0] <- NA
+ sds})
user system elapsed
0.36 0.02 0.37
> all.equal(res,res2)
[1] TRUE
>
HTH,
Chuck
>
> All the best,
>
> G
> --
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> Gavin Simpson [t] +44 (0)20 7679 0522
> ECRC [f] +44 (0)20 7679 0565
> UCL Department of Geography
> Pearson Building [e] gavin.simpsonATNOSPAMucl.ac.uk
> Gower Street
> London, UK [w] http://www.ucl.ac.uk/~ucfagls/
> WC1E 6BT [w] http://www.freshwaters.org.uk/
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
More information about the R-help
mailing list