[Rd] Suggestions for enhanced routines for "mlm" models.

Fri Feb 18 11:57:08 CET 2005

Dear R-devel'ers

Below is an outline for a set of routines to improve support for
multivariate linear models and "classical" repeated measurements
analysis. Nothing has been coded yet, so everything is subject to
change as loose ideas get confronted by the harsh realities of
programming.

Comments are welcome. They might even influence the implementation...

        -pd

General considerations:

	- S3 class based to fit existing code
        - similar to lm/glm code

fit <- lm(Y~...) creates basic mlm object (already does)

SSD(obj) returns object of class "SSD": 
           $SSD  matrix of sums of squares  & products 
           $df   degrees of freedom. 
         Methods for (lm and) mlm.

estVar(obj) obj$SSD/obj$df (could have methods for lm/mlm too)

Convenience functions:
  Tr is the trace operator sum(diag(M))
  proj is the projection operator possibly generalized to matrices.
  Rg: matrix rank (not sure we really need it, but see below)

mauchley.test(obj, M=diag(ncol=p), T = proj(X, orth=TRUE),
              X = matrix(rep(1,p)))
    (p = ncol(obj$SSD))

    Test of sphericity, i.e. that the obj represents a empirical
    covariance matrix S with true value proportional to M or that TST'
    is proportional to TMT'. Alternatively, give X with the property
    that TX == 0. (One sticky bit is that you can't really just use
    proj() because T  must have maximal rank. What is the current best
    practice for dealing with that?  qr() pivoting?)

summary.mlm
print.summary.mlm
vcov.mlm

    summary.mlm could be a little smarter than just coordinatewise
    summary.lm. It could at least provide the estimated residual
    covariance matrix (or SSD structure).  

    vcov currently inherits from "lm" leading to a completely
    arbitrarily scaled matrix. The correct matrix is a Kronecker
    product of the unscaled covariance matrix and estVar.

anova.mlm, anova.mlmlist, drop1.mlm, add1.mlm 

    These can (seemingly...) be obtained by relatively small
    modifications of their lm counterparts. The actual test
    calculations need to be excised from summary.manova (generalize?
    e.g. mvlin.test(SSD1, SSD2, method="Pillai")). It should be possible
    to wedge in tests under sphericity assumptions (with
    Greenhouse-Geisser and Huynh-Feldt corrections), as well as
    transformation/conditioning matrices (see below). 

    A more radical idea is to say that these are all different kinds
    of MANOVA tables and extend the *.manova functions to understand
    them. This would break the symmetry with lm, though, since you
    then need to use summary() to get a meaningful listing.

Notes on conditional tests (lower priority):

    Consider Y = (Y1,Y2) and a linear hypothesis matrix H.

    The test of zero intercept in the (multivariate) regression of H
    %*% Y2 on H %*% Y1 is of some interest, possibly after a linear
    transformation of Y. This is the test for additional information,
    but it is also the correct (ML) way of utilizing known-zero
    effects, e.g. pretreatment measurements. There is even a neat
    trick to fitting a linear structure across the responses by
    regressing on null-space contrasts.

    To some extent, conditional tests can be handled just by moving
    variables to the r.h.s. of the linear model specification, but
    there might be a point in having a more evocative interface,
    especially where transformed Y's are involved. This could be
    formula-based or matrix-based: contrasts=ginv(contr.sdif(4)) or
    formula based: ystruct=~index.

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907