# [R] Animal Morphology: Deriving Classification Equation with Linear Discriminant Analysis (lda)

(Ted Harding) Ted.Harding at manchester.ac.uk
Sun May 24 23:46:16 CEST 2009

```
On 24-May-09 20:32:06, cdm wrote:
> Dear Ted,
> Thank you for taking the time out to help me with this analysis.
> I'm seeing that I may have left out a crucial detail concerning
> this analysis. The ID measurement (interpubic distance) is a new
> measurement that has never been used in the field of ornithology
> (to my knowledge). The objective of the paper is to demonstrate
> the usefulness of ID. The paper compared ID with plumage criterion,
> a categorical variable at best, but under peer-review there is a
> request to use other morphological data to compare/contrast ID.
> Unfortunately, wing (WG) and weight (WT) were the only measurements
> taken in addition to ID in this study.

Many thanks for the above additional explanation, Chase. It leads to
an interpretation of the log(ID) vs log(LD) plot which could be fruitful.
Namely, the ID is a linear dimension, and the WT could be considered
as closely reflecting a (linear dimension)^3. If you look at the plot
of log(WT) vs log(ID):

## Plot log(WT) vs log(ID) (M & F)
plot(lID,lWT)
points(lID[ix.M],lWT[ix.M],pch="+",col="blue")
points(lID[ix.F],lWT[ix.F],pch="+",col="red")

it is apparent that a linear increase in log(ID) as log(WT) increases
is a very good description of what is happening. Also, that the
scatter about the linear relationship is very uniform. Therefore,
a linear regression of log(ID) on log(WT) should be closely related
to the linear discrimination. First, the linear regression:

lLM <- lm(lID ~ lWT)
summary(lLM)$coef
#               Estimate Std. Error   t value     Pr(>|t|)
# (Intercept) -10.657775  0.6562166 -16.24125 5.971407e-35
# lWT           4.901037  0.2671783  18.34369 2.899008e-40

so the slope is 4.901037, and the slope of a linear discriminant
is likely to be close to -1/4.901037 = -0.2040385. So:

library(MASS)
lda(SEX ~ lWG + lWT + lID)
# [...]
# Coefficients of linear discriminants:
#            LD1
# lWG   5.304967
# lWT -11.604919
# lID  -2.707374

so the slope of a linear discriminant (based on all 3 variables)
with respect to variation in log(WT) and log(ID) alone is
-2.707374/11.604919 = -0.2332954
which is quite close to the above. It is also interesting to do the
discrimination using only log(WT) and log(ID):

lda(SEX ~ lWT + lID)
# [...]
# Coefficients of linear discriminants:
#            LD1
# lWT -11.352949
# lID  -2.673019

So *very little change* compared with using all three variables;
and the slope of this discriminant is -2.673019/11.352949 = -0.2354471,
almost unchanged compared with the three variables.
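
## A small sketch recomputing both slopes from the printed LD1
## coefficients alone (no data needed; the names c3, c2 and ld.slope
## are illustrative, not from the session above). The boundary
## LD1 = const is a line in the (lID, lWT) plane with slope
## -coef(lID)/coef(lWT):
c3 <- c(lWT = -11.604919, lID = -2.707374)   # three-variable fit
c2 <- c(lWT = -11.352949, lID = -2.673019)   # two-variable fit
ld.slope <- function(cf) unname(-cf["lID"]/cf["lWT"])
ld.slope(c3)
ld.slope(c2)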

You can see the performance of the discriminator by plotting
histograms of it (here I'll use the 2-variable one):

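ix.M <- (SEX=="M") ; ix.F <- (SEX=="F")
LD <- 11.352949*lWT + 2.673019*lID
hist(LD[ix.M], breaks=0.5*(40:80), col="blue")
## The second call is cut off in the archived message; the arguments
## below are assumed to mirror the first call, overlaid with add=TRUE
hist(LD[ix.F], breaks=0.5*(40:80), col="red", add=TRUE)
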
Inspection of this, however, raises some interesting questions
which I'd prefer to discuss with you off-list (also your queries
relating to efficacy of ID).

Ted.
[But see just one short comment below]

> The purpose of the LDA is to demonstrate the power of ID in the
> context of WG and WT. I agree that WG is a terrible metric for
> discrimination, WT is good but there is significant overlap between
> groups, but ID is a good discriminator on its own (classified 97-100%
> of all individuals based on 92.5% CI).
>
> You pointed out that I am violating assumptions with LDA based on
> different covariances between sexes (thank you... I never would have
> caught it). I'm wondering how to proceed.

As pointed out in my correction, if you work with logs it looks OK
on that front! More later.

> Should I:
>
> 1) Perform linear discrimination with WT and ID, and then determine a
> classification equation? And, if I do, how do I derive the classification
> equation (e.g. [Cj = cj0+ cjWTxWT+ cjIDxID; Cj>x= male, Cj<x=female])
>
> 2) Demonstrate that ID is important based on linear discriminant
> coefficients and structure coefficients from this WG, WT, and ID LDA;
> discuss the assumption violation and argue for its use as a demonstration
> of variable predicting power; and NOT provide a classification equation
> because we already have ID ranges and it would be inappropriate.
>
> 3) Both #1 and #2 because WT and ID provide such a good discriminating
> function and use the WG, WT, and ID LDA for demonstration of variable
> prediction value.
>
> 4) ??? better suggestions.
>
> THANK YOU so much for responding and all of your insight. I'm humbled by
> your R skills... that code nearly took me all day to write (little by
> little I'm learning).
>
> Chase
>
>
>
> Ted.Harding-2 wrote:
>>
>> [Your data and output listings removed. For comments, see at end]
>>
>> On 24-May-09 13:01:26, cdm wrote:
>>> Fellow R Users:
>>> I'm not extremely familiar with lda or R programming, but a recent
>>> editorial review of a manuscript submission has prompted a crash
>>> course. I am on this forum hoping I could solicit some much needed
>>> advice for deriving a classification equation.
>>>
>>> I have used three basic measurements in lda to predict two groups:
>>> male and female. I have a working model, low Wilks' lambda, graphs,
>>> coefficients, eigenvalues, etc. (see below). I adjusted the sample
>>> analysis for Fisher's or Anderson's Iris data provided in the MASS
>>> library for my own data.
>>>
>>> My final and last step is simply to form the classification equation.
>>> The classification equation is simply using standardized coefficients
>>> to classify each group- in this case male or female. A more thorough
>>> explanation is provided:
>>>
>>> "For cases with an equal sample size for each group the classification
>>> function coefficient (Cj) is expressed by the following equation:
>>>
>>> Cj = cj0+ cj1x1+ cj2x2+...+ cjpxp
>>>
>>> where Cj is the score for the jth group, j = 1 ... k, cj0 is the
>>> constant for the jth group, and x = raw scores of each predictor.
>>> If W = within-group variance-covariance matrix, and M = column matrix
>>> of means for group j, then the constant   cj0= (-1/2)CjMj" (Julia
>>> Barfield, John Poulsen, and Aaron French
>>> http://userwww.sfsu.edu/~efc/classes/biol710/discrim/discriminant.htm).
>>>
>>> I am unable to navigate this last step based on the R output I have.
>>> I only have the linear discriminant coefficients for each predictor
>>> that would be needed to complete this equation.
>>>
>>> Please, if anybody is familiar or able to help please let me know.
>>> There is a spot in the acknowledgments for you.
>>>
>>> All the best,
>>> Chase Mendenhall
>>
>> The first thing I did was to plot your data. This indicates in the
>> first place that a perfect discrimination can be obtained on the
>> basis of your variables WRMA_WT and WRMA_ID alone (names abbreviated
>> to WG, WT, ID, SEX):
>>
>>   D0 <- read.csv("horsesLDA.csv")
>>   # names(D0) # "WRMA_WG"  "WRMA_WT"  "WRMA_ID"  "WRMA_SEX"
>>   WG<-D0$WRMA_WG; WT<-D0$WRMA_WT;
>>   ID<-D0$WRMA_ID; SEX<-D0$WRMA_SEX
>>
>>   ix.M<-(SEX=="M"); ix.F<-(SEX=="F")
>>
>>   ## Plot WT vs ID (M & F)
>>   plot(ID,WT,xlim=c(0,12),ylim=c(8,15))
>>   points(ID[ix.M],WT[ix.M],pch="+",col="blue")
>>   points(ID[ix.F],WT[ix.F],pch="+",col="red")
>>   lines(ID,15.5-1.0*(ID))
>>
>> and that there is a lot of possible variation in the discriminating
>> line WT = 15.5-1.0*(ID)
>>
>> Also, it is apparent that the covariance between WT and ID for Females
>> is different from the covariance between WT and ID for Males. Hence
>> the assumption (of common covariance matrix in the two groups) for
>> standard LDA (which you have been applying) does not hold.
>>
>> Given that the sexes can be perfectly discriminated within the data
>> on the basis of the linear discriminator (WT + ID) (and others),
>> the variable WG is in effect a close approximation to noise.
>>
>> However, to the extent that there was a common covariance matrix
>> to the two groups (in all three variables WG, WT, ID), and this
>> was well estimated from the data, then inclusion of the third
>> variable WG could yield a slightly improved discriminator in that
>> the probability of misclassification (a rare event for such data)
>> could be minimised. But it would not make much difference!
>>
>> However, since that assumption does not hold, this analysis would
>> not be valid.
>>
>> If you plot WT vs WG, a common covariance is more plausible; but
>> there is considerable overlap for these two variables:
>>
>>   plot(WG,WT)
>>   points(WG[ix.M],WT[ix.M],pch="+",col="blue")
>>   points(WG[ix.F],WT[ix.F],pch="+",col="red")
>>
>> If you plot WG vs ID, there is perhaps not much overlap, but a
>> considerable difference in covariance between the two groups:
>>
>>   plot(ID,WG)
>>   points(ID[ix.M],WG[ix.M],pch="+",col="blue")
>>   points(ID[ix.F],WG[ix.F],pch="+",col="red")
>>
>> This looks better on a log scale, however:
>>
>>   lWG <- log(WG) ; lWT <- log(WT) ; lID <- log(ID)
>>   ## Plot log(WG) vs log(ID) (M & F)
>>   plot(lID,lWG)
>>   points(lID[ix.M],lWG[ix.M],pch="+",col="blue")
>>   points(lID[ix.F],lWG[ix.F],pch="+",col="red")
>>
>> and common covariance still looks good for WG vs WT:
>>
>>   ## Plot log(WT) vs log(WG) (M & F)
>>   plot(lWG,lWT)
>>   points(lWG[ix.M],lWT[ix.M],pch="+",col="blue")
>>   points(lWG[ix.F],lWT[ix.F],pch="+",col="red")
>>
>> but there is no improvement for WT vs ID:
>>
>>   ## Plot log(WT) vs log(ID) (M & F)
>>   plot(ID,WT,xlim=c(0,12),ylim=c(8,15))
>>   points(ID[ix.M],WT[ix.M],pch="+",col="blue")
>>   points(ID[ix.F],WT[ix.F],pch="+",col="red")
>>
>> So there is no simple road to applying a routine LDA to your data.
>>
>> To take account of different covariances between the two groups,
>> you would normally be looking at a quadratic discriminator. However,
>> as indicated above, the fact that a linear discriminator using
>> the variables ID & WT alone works so well would leave considerable
>> imprecision in conclusions to be drawn from its results.
>>
>> Sorry this is not the straightforward answer you were hoping for
>> (which I confess I have not sought); it is simply a reaction to
>> what your data say.
>>
>> Ted.
>>
>> --------------------------------------------------------------------
>> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 24-May-09                                       Time: 20:07:43
>> ------------------------------ XFMail ------------------------------
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 24-May-09                                       Time: 22:46:12
------------------------------ XFMail ------------------------------

```
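
The classification equation asked about in the thread, Cj = cj0 + cj1x1 + ... + cjpxp, can be sketched directly from the group means and the pooled within-group covariance matrix W: the coefficients for group j are W^{-1} Mj and the constant is -1/2 Mj' W^{-1} Mj. The example below is a minimal sketch under stated assumptions, not the analysis from the thread: the data are simulated (the means and spreads are made up), equal priors are assumed, and the names `d`, `X`, `M`, `W`, `Cj`, `cj0`, `d.coef` and `d.const` are illustrative only.

```r
library(MASS)

set.seed(1)
n <- 50
# Simulated stand-in data (hypothetical values, NOT the birds in the thread)
d <- data.frame(
  SEX = factor(rep(c("F", "M"), each = n)),
  lWT = c(rnorm(n, 2.45, 0.05), rnorm(n, 2.35, 0.05)),
  lID = c(rnorm(n, 1.80, 0.20), rnorm(n, 0.60, 0.20))
)
X <- as.matrix(d[, c("lWT", "lID")])

# Group means M (one row per group, in factor-level order)
M <- apply(X, 2, function(v) tapply(v, d$SEX, mean))
# Pooled within-group covariance W, divisor n - (number of groups)
W <- Reduce(`+`, lapply(levels(d$SEX), function(g) {
  Xg <- X[d$SEX == g, , drop = FALSE]
  (nrow(Xg) - 1) * cov(Xg)
})) / (nrow(X) - nlevels(d$SEX))

# Classification functions, one column per group:
# coefficients W^{-1} M_j and constant -1/2 M_j' W^{-1} M_j
Cj  <- solve(W, t(M))
cj0 <- -0.5 * colSums(t(M) * Cj)
colnames(Cj) <- levels(d$SEX)
names(cj0)   <- levels(d$SEX)

# Score every row under each group and pick the group with the larger score
scores <- sweep(X %*% Cj, 2, cj0, "+")
pred   <- factor(levels(d$SEX)[max.col(scores)], levels = levels(d$SEX))

# With two groups this collapses to one equation: classify "M" when D > 0
d.coef  <- Cj[, "M"] - Cj[, "F"]
d.const <- cj0[["M"]] - cj0[["F"]]
D <- drop(X %*% d.coef) + d.const

# Cross-check against MASS::lda fitted with the same equal priors
fit   <- lda(SEX ~ lWT + lID, data = d, prior = c(0.5, 0.5))
agree <- mean(pred == predict(fit, d)$class)
agree
```

Here `d.coef` and `d.const` play the role of the coefficients and constant in the single-equation form, with 0 as the cut-off. With unequal priors one would add `log(prior_j)` to each `cj0`.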