[R] lda on curves

Christian Hennig hennig at stat.math.ethz.ch
Mon Feb 17 13:59:54 CET 2003

```
Hi,

I have recently been working on linear dimension reduction for classification.
There is a research report on
ftp://ftp.stat.math.ethz.ch/Research-Reports/108.html
In this report I discuss nine methods for linear dimension reduction, five
of which are new. Four of the methods do not perform "internal scaling", which
you want to avoid. Two of these four have been published before by other
authors. The references are
Young, Marco and Odell, Journal of Statistical Planning and Inference, 17
(1987), 307-319
and
Hastie and Tibshirani, IEEE Transactions on Pattern Analysis and Machine
Intelligence, 18 (1996), 607-616.
I have R functions for all the methods, but I don't want to make them public
before the corresponding paper is published. If you are interested, please
contact me off list.
There is more literature on "unscaled canonical variates", especially by
W. Krzanowski; two references are
Krzanowski, Journal of Chemometrics, 9 (1995), 509-520
Kiers and Krzanowski in Gaul, Opitz and Schader (Eds.) Data Analysis,
Springer, Berlin 2000, 207-218.

Best,
Christian

On Mon, 17 Feb 2003, Murray Jorgensen wrote:

> I'm working on a rather interesting consulting problem with a client. A
> number of physical variables are measured on a number of cricket bowlers
> in the performance of a delivery. An example variable might be a
> directional component of angular momentum for a particular joint
> measured at a large number (101) of equally spaced timepoints.
>
> Each bowler generates a (fairly smooth) curve for each variable
> measured. I decided to represent each curve by a few orthogonal
> polynomial contrasts.
>
> There are 4 groups of bowlers corresponding to various speeds of
> delivery. I want to use canonical variate analysis to find linear
> combinations of my transformed variables discriminating well between the
> groups of bowlers.
>
> I used lda() from the MASS library to do this, but examining the output
> I notice that the higher-order orthogonal polynomials are getting larger
> coefficients than the more important lower-order ones. This is clearly
> because some scaling of the variables is being done by lda(), and
> because the higher-order polynomial variable values are smaller, they are
> scaled up.
>
> I would like to turn off this scaling as it is not what is needed in
> this problem and will cause the tail to "wag the dog". There is no
> obvious parameter to do this in
>
> lda(x,   grouping, prior = proportions, tol = 1.0e-4,
>                     subset, na.action = na.fail,
>                     method, CV = FALSE, nu)
>
> so I thought that I might try a hack. However:
>
>  > lda
> function (x, ...)
> {
>      if (is.null(class(x)))
>          class(x) <- data.class(x)
>      UseMethod("lda", x, ...)
> }
>
>
> Any ideas about how to perform an unscaled canonical variates analysis?
>
> Cheers,
>
> Murray
>

--
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (currently)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de

```
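
The question in the thread, how to obtain canonical variates without any rescaling of the variables, can be illustrated with a minimal base-R sketch. It rests on one possible reading of "unscaled": the directions are taken as orthonormal eigenvectors of the between-group scatter matrix alone, so the within-group spread never enters. That reading is an assumption made here for illustration; it is not necessarily the method of any of the papers cited above, nor of the unpublished functions mentioned in the reply. The function name `unscaled_cva` is hypothetical and not part of MASS, and the built-in `iris` data merely stands in for the bowling data.

```r
# Hypothetical helper (not part of MASS): directions that separate the group
# means, computed from the between-group scatter matrix only, so that no
# variable is rescaled by its within-group spread.
unscaled_cva <- function(x, grouping) {
  x <- as.matrix(x)
  grouping <- factor(grouping)            # also drops unused levels
  n_g <- as.vector(table(grouping))       # group sizes, in level order

  # Group means, one row per group (rows follow the factor levels)
  means <- rowsum(x, grouping) / n_g

  # Centre the group means at the overall mean
  overall <- colMeans(x)
  centred <- sweep(means, 2, overall)

  # Between-group scatter: B = sum_g n_g (m_g - m)(m_g - m)'
  B <- crossprod(centred * sqrt(n_g))

  # Orthonormal eigenvectors of B; at most (groups - 1) are informative
  e <- eigen(B, symmetric = TRUE)
  k <- min(ncol(x), nlevels(grouping) - 1)
  v <- e$vectors[, seq_len(k), drop = FALSE]

  list(values  = e$values[seq_len(k)],
       vectors = v,
       scores  = sweep(x, 2, overall) %*% v)
}

# Illustration on a built-in data set (a stand-in for the real measurements)
res <- unscaled_cva(iris[, 1:4], iris$Species)
res$vectors
```

Because the eigenvectors have unit length and the variables are left on their original scale, a variable with small values (such as a high-order polynomial contrast) contributes little to the between-group scatter and so is not inflated in the coefficients. The price is that the result depends on the units of measurement, which is only sensible when the variables are commensurable.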