[Rd] Take marginality into account in factor.scope
ONKELINX, Thierry
Thierry.ONKELINX at inbo.be
Mon Apr 28 15:40:03 CEST 2008
Dear devellopeRs,
I've altered factor.scope() from the stats package so it can take
marginality into account. Therefore I've added the argument marginal.
When marginal is NULL, factor.scope() works like the original version.
The marginality is to be specified as a named list. The name of the
elements is the name of a variable. The content of the element is a
character vector with the variable(s) that must be in the model.
E.g. marginal = list(D = c("B", "C"), A3 = "A2", A2 = "A1")
Means that
- D can only enter the model if B and C are present.
- A2 can only enter the model if A1 is present.
- A3 can only enter the model if A2 is present. Since the list contains
A2 = "A1", this implies that A1 needs to be present too.
- B can only leave the model if D is not present.
- C can only leave the model if D is not present.
- A2 can only leave the model if A3 is not present.
- A1 can only leave the model if A2 and A3 are not present.
It copes with interactions too.
A1:B needs
by default
A1 and B
due to marginality
none
A2:B needs
by default
A2 and B
due marginality
A1 and A1:B
Comments are welcome,
Cheers,
Thierry
factor.scope <- function (factor, scope, marginal = NULL) {
drop <- scope$drop
add <- scope$add
if (length(factor) && !is.null(drop)) {
nmdrop <- colnames(drop)
facs <- factor
if (length(drop)) {
nmfac <- colnames(factor)
nmfac0 <- sapply(strsplit(nmfac, ":", fixed = TRUE),
function(x) paste(sort(x), collapse = ":"))
nmdrop0 <- sapply(strsplit(nmdrop, ":", fixed = TRUE),
function(x) paste(sort(x), collapse = ":"))
where <- match(nmdrop0, nmfac0, 0)
if (any(!where)){
stop(gettextf("lower scope has term(s) %s not included
in model", paste(sQuote(nmdrop[where == 0]), collapse = ", ")), domain
= NA)
}
facs <- factor[, -where, drop = FALSE]
nmdrop <- nmfac[-where]
} else {
nmdrop <- colnames(factor)
}
if (ncol(facs) > 1) {
keep <- rep.int(TRUE, ncol(facs))
f <- crossprod(facs > 0)
for (i in seq(keep)){
keep[i] <- max(f[i, -i]) != f[i, i]
}
nmdrop <- nmdrop[keep]
}
keep <- unlist(lapply(strsplit(nmdrop, ":", fixed = TRUE),
function(x){
whereMarginal <- match(x, names(marginal))
if(!all(is.na(whereMarginal))){
if(length(x) == 1){
marginal[[whereMarginal]]
} else {
lowerOrderTerms <- lapply(seq_along(whereMarginal),
function(z){
if(is.na(whereMarginal[z])){
x[z]
} else {
c(x[z], marginal[[whereMarginal[z]]])
}
})
lowerOrder <- apply(expand.grid(lowerOrderTerms), 1,
function(z){
paste(sort(z), collapse = ":")
})
lowerOrder[!lowerOrder %in% paste(sort(x), collapse
= ":")]
}
}
}))
nmdrop <- nmdrop[!nmdrop %in% keep]
} else {
nmdrop <- character(0)
}
if (!length(add)){
nmadd <- character(0)
} else {
nmfac <- colnames(factor)
nmadd <- colnames(add)
if (!is.null(nmfac)) {
nmfac0 <- sapply(strsplit(nmfac, ":", fixed = TRUE),
function(x) paste(sort(x), collapse = ":"))
nmadd0 <- sapply(strsplit(nmadd, ":", fixed = TRUE),
function(x) paste(sort(x), collapse = ":"))
where <- match(nmfac0, nmadd0, 0)
if (any(!where)){
stop(gettextf("upper scope does not include model
term(s) %s", paste(sQuote(nmfac[where == 0]), collapse = ", ")), domain
= NA)
}
nmadd <- nmadd[-where]
add <- add[, -where, drop = FALSE]
}
if (ncol(add) > 1) {
keep <- rep.int(TRUE, ncol(add))
f <- crossprod(add > 0)
for (i in seq(keep)){
keep[-i] <- keep[-i] & (f[i, -i] < f[i, i])
}
nmadd <- nmadd[keep]
}
if(length(nmadd)){
keep <- sapply(strsplit(nmadd, ":", fixed = TRUE),
function(x){
whereMarginal <- match(x, names(marginal))
if(!all(is.na(whereMarginal))){
if(length(x) == 1){
all(!marginal[[whereMarginal]] %in% nmadd)
} else {
lowerOrderTerms <-
lapply(seq_along(whereMarginal), function(z){
if(is.na(whereMarginal[z])){
x[z]
} else {
c(x[z], marginal[[whereMarginal[z]]])
}
})
lowerOrder <-
apply(expand.grid(lowerOrderTerms), 1, function(z){
paste(sort(z), collapse = ":")
})
lowerOrder <- lowerOrder[!lowerOrder %in%
paste(sort(x), collapse = ":")]
all(!lowerOrder %in% nmadd)
}
} else {
TRUE
}
})
nmadd <- nmadd[keep]
}
}
list(drop = nmdrop, add = nmadd)
}
------------------------------------------------------------------------
----
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
More information about the R-devel
mailing list