[R] t-test via matrix operations

Thu Aug 15 16:03:27 CEST 2002

Eric Kort <Eric.Kort at vai.org> wrote:
> 2) How do I proceed to calculate the row variances with matrix
> operations (specifically, how can I use matrix operations to subtract each
> element of each row from the corresponding row mean)?

Brian D. Ripley <ripley at stats.ox.ac.uk> replied:
> sweep does the subtraction, but I would make use of rowmeans(d^2) and
> rowmeans(d) in the classic hand-calculation formula.

Here are four useful functions that build on rowMeans and friends:

colVars <- function(x, na.rm=FALSE, dims=1, unbiased=TRUE, SumSquares=FALSE,
                    twopass=FALSE) {
  if (SumSquares) return(colSums(x^2, na.rm, dims))
  N <- colSums(!is.na(x), FALSE, dims)
  Nm1 <- if (unbiased) N-1 else N
  if (twopass) {x <- if (dims==length(dim(x))) x - mean(x, na.rm=na.rm) else
                     sweep(x, (dims+1):length(dim(x)), colMeans(x,na.rm,dims))}
  (colSums(x^2, na.rm, dims) - colSums(x, na.rm, dims)^2/N) / Nm1
}

rowVars <- function(x, na.rm=FALSE, dims=1, unbiased=TRUE, SumSquares=FALSE,
                    twopass=FALSE) {
  if (SumSquares) return(rowSums(x^2, na.rm, dims))
  N <- rowSums(!is.na(x), FALSE, dims)
  Nm1 <- if (unbiased) N-1 else N
  if (twopass) {x <- if (dims==0) x - mean(x, na.rm=na.rm) else
                     sweep(x, 1:dims, rowMeans(x,na.rm,dims))}
  (rowSums(x^2, na.rm, dims) - rowSums(x, na.rm, dims)^2/N) / Nm1
}

colStdevs <- function(x, ...) sqrt(colVars(x, ...))

rowStdevs <- function(x, ...) sqrt(rowVars(x, ...))

-----------------------------------------------------------------------------

And in case any core member is inspired to put these into base, here's the .Rd:

\name{colVars}
\alias{colVars}
\alias{rowVars}
\alias{colStdevs}
\alias{rowStdevs}
\title{Column and Row Variances}
\description{Variances, or standard deviations by column (or row) of an array.}
\usage{
  colVars (x, na.rm=FALSE, dims=1, unbiased=TRUE, SumSquares=FALSE,
           twopass=FALSE)
  rowVars (x, na.rm=FALSE, dims=1, unbiased=TRUE, SumSquares=FALSE,
           twopass=FALSE)
  colStdevs(x, ...)
  rowStdevs(x, ...)
}
\arguments{
  \item{x}{A numeric array (or a dataframe to convert to a matrix).}
  \item{na.rm}{Logical: Remove NA's?}
  \item{dims}{Number of dimensions to sum over [colSums] or leave alone
              [rowSums].  Only useful when x is a multidimensional array.}
  \item{unbiased}{Logical: Use (N-1) in the denominator when calculating
                  variance?}
  \item{SumSquares}{Logical: If TRUE, colVars just returns sums of squares.}
  \item{twopass}{Logical: If TRUE, colVars uses the corrected two-pass
     algorithm of Chan Golub & LeVeque, which is slower but less subject
     to roundoff error.}
  \item{...}{colStdevs/rowStdevs take the same arguments as colVars/rowVars.}
}
\details{On a matrix:

  colVars (x, na.rm) == apply(x, 2, var,  na.rm=na.rm)

  rowVars (x, na.rm) == apply(x, 1, var,  na.rm=na.rm)
}
\value{
  A vector or array with dimensionality length(dim(x))-dims [colVars] or
  dims [rowVars].  Dimnames of the remaining dimensions are preserved.
  When the result is 1-dimensional, it is always demoted to a vector.
}
\author{Originally by Douglas Bates <bates at stat.wisc.edu> as package
       "MatUtils".  Modified, expanded, and renamed by David Brahm
       <brahm at alum.mit.edu>, with help of course from the R-help gurus.}
\seealso{\code{\link{apply}}}
\examples{
x <- matrix(1:12, 3,4, dimnames=list(letters[1:3], LETTERS[1:4]))
x[2,2] <- NA
colVars(x)
}
\keyword{array}
\keyword{arith}

-- 
                              -- David Brahm (brahm at alum.mit.edu)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._