[Rd] Suggestions for enhanced routines for "mlm" models.
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Fri Feb 18 11:57:08 CET 2005
Dear R-devel'ers
Below is an outline for a set of routines to improve support for
multivariate linear models and "classical" repeated measurements
analysis. Nothing has been coded yet, so everything is subject to
change as loose ideas get confronted by the harsh realities of
programming.
Comments are welcome. They might even influence the implementation...
-pd
General considerations:
- S3 class based to fit existing code
- similar to lm/glm code
fit <- lm(Y~...) creates basic mlm object (already does)
SSD(obj) returns object of class "SSD":
$SSD matrix of sums of squares & products
$df degrees of freedom.
Methods for (lm and) mlm.
estVar(obj) obj$SSD/obj$df (could have methods for lm/mlm too)
Convenience functions:
Tr is the trace operator sum(diag(M))
proj is the projection operator possibly generalized to matrices.
Rg: matrix rank (not sure we really need it, but see below)
mauchley.test(obj, M=diag(ncol=p), T = proj(X, orth=TRUE),
X = matrix(rep(1,p)))
(p = ncol(obj$SSD))
Test of sphericity, i.e. that the obj represents a empirical
covariance matrix S with true value proportional to M or that TST'
is proportional to TMT'. Alternatively, give X with the property
that TX == 0. (One sticky bit is that you can't really just use
proj() because T must have maximal rank. What is the current best
practice for dealing with that? qr() pivoting?)
summary.mlm
print.summary.mlm
vcov.mlm
summary.mlm could be a little smarter than just coordinatewise
summary.lm. It could at least provide the estimated residual
covariance matrix (or SSD structure).
vcov currently inherits from "lm" leading to a completely
arbitrarily scaled matrix. The correct matrix is a Kronecker
product of the unscaled covariance matrix and estVar.
anova.mlm, anova.mlmlist, drop1.mlm, add1.mlm
These can (seemingly...) be obtained by relatively small
modifications of their lm counterparts. The actual test
calculations need to be excised from summary.manova (generalize?
e.g. mvlin.test(SSD1, SSD2, method="Pillai")). It should be possible
to wedge in tests under sphericity assumptions (with
Greenhouse-Geisser and Huynh-Feldt corrections), as well as
transformation/conditioning matrices (see below).
A more radical idea is to say that these are all different kinds
of MANOVA tables and extend the *.manova functions to understand
them. This would break the symmetry with lm, though, since you
then need to use summary() to get a meaningful listing.
Notes on conditional tests (lower priority):
Consider Y = (Y1,Y2) and a linear hypothesis matrix H.
The test of zero intercept in the (multivariate) regression of H
%*% Y2 on H %*% Y1 is of some interest, possibly after a linear
transformation of Y. This is the test for additional information,
but it is also the correct (ML) way of utilizing known-zero
effects, e.g. pretreatment measurements. There is even a neat
trick to fitting a linear structure across the responses by
regressing on null-space contrasts.
To some extent, conditional tests can be handled just by moving
variables to the r.h.s. of the linear model specification, but
there might be a point in having a more evocative interface,
especially where transformed Y's are involved. This could be
formula-based or matrix-based: contrasts=ginv(contr.sdif(4)) or
formula based: ystruct=~index.
--
O__ ---- Peter Dalgaard Blegdamsvej 3
c/ /'_ --- Dept. of Biostatistics 2200 Cph. N
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
More information about the R-devel
mailing list