[Rd] sweep sanity checking?
Petr Savicky
savicky at cs.cas.cz
Fri Jul 13 22:37:36 CEST 2007
I would like to suggest a replacement for the curent function
sweep based on the two previous suggestions posted at
https://stat.ethz.ch/pipermail/r-help/2005-June/073989.html
and
http://wiki.r-project.org/rwiki/doku.php?id=rdoc:base:sweep
with some extensions.
My suggestion is to use one of the following two variants. They
differ in whether length(STATS) == 1 is allowed without a warning
in the stricter (default) check or not. I don't know, what is better.
sweep1 <- function (x, MARGIN, STATS, FUN = "-", check.margin=TRUE, ...)
{
FUN <- match.fun(FUN)
dims <- dim(x)
dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
if (length(MARGIN) < length(dimstat)) {
STATS <- drop(STATS)
dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
}
if (check.margin && length(STATS) > 1 &&
(length(dimstat)!=length(MARGIN) || any(dims[MARGIN]!=dimstat))) {
warning("length(STATS) or dim(STAT) do not match dim(x)[MARGIN]")
} else if (prod(dims[MARGIN]) %% length(STATS)!=0)
warning("prod(dim(x)[MARGIN]) is not a multiple of length(STATS)")
perm <- c(MARGIN, (1:length(dims))[-MARGIN])
FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
}
sweep2 <- function (x, MARGIN, STATS, FUN = "-", check.margin=TRUE, ...)
{
FUN <- match.fun(FUN)
dims <- dim(x)
dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
if (length(MARGIN) < length(dimstat)) {
STATS <- drop(STATS)
dimstat <- if (is.null(dim(STATS))) length(STATS) else dim(STATS)
}
if (check.margin &&
(length(dimstat)!=length(MARGIN) || any(dims[MARGIN]!=dimstat))) {
warning("length(STATS) or dim(STAT) do not match dim(x)[MARGIN]")
} else if (prod(dims[MARGIN]) %% length(STATS)!=0)
warning("prod(dim(x)[MARGIN]) is not a multiple of length(STATS)")
perm <- c(MARGIN, (1:length(dims))[-MARGIN])
FUN(x, aperm(array(STATS, dims[perm]), order(perm)), ...)
}
The functions are tested on the following examples.
a <- array(1:64,dim=c(4,4,4))
M <- c(1,3);
b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)
a[,2,] - - - -
a[1:2,2,] warning warning - -
1 - warning - -
1:3 warning warning warning warning
1:16 warning warning - -
a <- matrix(1:8,nrow=4,ncol=2);
M <- 1;
b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)
1 - warning - -
1:2 warning warning - -
1:3 warning warning warning warning
1:4 - - - -
matrix(1:4,nrow=1) # (A)
- - - -
matrix(1:4,ncol=1) # (B)
- - - -
a <- matrix(1:8,nrow=4,ncol=2);
M <- 2;
b sweep1(a,M,b) sweep2(a,M,b) sweep1(a,M,b,c=F) sweep2(a,M,b,c=F)
1 - warning - -
1:2 - - - -
1:3 warning warning warning warning
1:4 warning warning warning warning
matrix(1:2,ncol=1) # (A)
- - - -
matrix(1:2,nrow=1) # (B)
- - - -
Note that cases marked (A) do not generate a warning, although they
should. This is the cost of using drop(STATS), which allows cases
marked (B) without a warning. In my opinion, cases (B) make sense.
Reproducing the tests is possible using the script
http://www.cs.cas.cz/~savicky/R-devel/verify_sweep.R
Petr.
More information about the R-devel
mailing list