[R] Discriminant Function Analysis

Uwe Ligges ligges at statistik.uni-dortmund.de
Tue Jul 5 20:42:37 CEST 2005


michael watson (IAH-C) wrote:

> Dear All
> 
> This is more of a statistics question than a question about help for R,
> so forgive me.
> 
> I am using lda from the MASS package to perform linear discriminant
> function analysis.  I have 14 cases belonging to two groups and have
> measured each of 37 variables.  I want to find those variables that best
> discriminate between the two groups, and I want to visualise that and
> create a classification function.  Please note at this stage it is a
> proof of concept problem - I realise that I must follow this up with a
> much more robust analysis involving cross-validation.
> 
> 1) First problem, I got this error message:
> 
> > z <- lda(C0GRP_NA ~ ., dpi30)
> 
> Warning message: 
> variables are collinear in: lda.default(x, grouping, ...) 
> 
> I guess this is not a good thing, however, I *did* get a result and it
> discriminated perfectly between my groups.  Can anyone explain what this
> means?  Does it invalidate my results?

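Well, 14 cases and 37 variables mean that not many degrees of freedom 
are left ... ;-)
With more variables than cases the within-group covariance matrix is 
singular, hence the warning. Of course you get a perfect fit: you would 
get one with arbitrary data, too.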
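
To illustrate the point, here is a minimal sketch in base R. It does not call lda() itself; it only shows that with 14 cases and 37 variables a linear function can reproduce any two-group labelling of pure noise. The simulated matrix X and the labels y are assumptions standing in for the real data.

```r
# Assumption: pure noise standing in for the real data (14 cases, 37 variables).
set.seed(42)
n <- 14
p <- 37
X <- matrix(rnorm(n * p), nrow = n)
y <- rep(c(-1, 1), each = n / 2)   # arbitrary two-group labels

# Centre each variable within its group. The centred matrix has rank
# n - 2 = 12, far below p = 37, so the pooled covariance is singular.
Xc <- X - apply(X, 2, function(col) ave(col, y))
within_rank <- sum(svd(Xc)$d > 1e-8)

# Minimum-norm coefficients b that solve X %*% b = y exactly,
# which is possible because there are more variables than cases.
b <- t(X) %*% solve(X %*% t(X), y)
fitted <- drop(X %*% b)

# The sign of the fitted value recovers every label, although X is noise.
perfect <- all(sign(fitted) == y)
```

Any other assignment of the 14 cases to two groups would be reproduced just as well, which is why perfect separation on the training data says nothing here.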

> 
> 2) My analysis came up with one discriminant variable.  How do I control
> how many are produced?  I currently assume this is the only significant
> discriminant variable found.  Can I insist it finds more?

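Well, with two groups there is at most one discriminant function (at 
most the number of groups minus one), and if projection into one 
dimension is already perfect, it's hard to find a second one that 
improves the result...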
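
A short sketch of how the number of discriminants depends on the number of groups, using the built-in iris data as a stand-in (an assumption, since the original dpi30 data are not available). The columns of the scaling component are the discriminant functions.

```r
library(MASS)

# Three groups, four variables: at most min(4, 3 - 1) = 2 discriminants.
fit3 <- lda(Species ~ ., data = iris)
n_ld3 <- ncol(fit3$scaling)

# Two groups only: at most one discriminant, whatever the number of variables.
iris2 <- droplevels(subset(iris, Species != "setosa"))
fit2 <- lda(Species ~ ., data = iris2)
n_ld2 <- ncol(fit2$scaling)
```

So with two groups there is nothing to "insist" on; a second discriminant would need a third group.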


> 3) More of a tip - when my analysis only finds one significant variable,
> what is a good way to visualise this graphically?

Depends on the amount of data: either all data on one line, maybe 
jittered, or maybe even better two boxplots, given that there really is 
perfect (and sensible) separation ....
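
Both suggestions can be sketched as follows. The simulated data frame dat (two groups, three variables) is hypothetical and only serves to produce scores on a single discriminant.

```r
library(MASS)

# Assumption: simulated two-group data in place of the real measurements.
set.seed(1)
n <- 20
grp <- factor(rep(c("A", "B"), each = n))
x <- matrix(rnorm(2 * n * 3), ncol = 3)
x[grp == "B", 1] <- x[grp == "B", 1] + 3   # shift group B on one variable
dat <- data.frame(grp = grp, x)

fit <- lda(grp ~ ., data = dat)
scores <- predict(fit, newdata = dat)$x[, 1]   # scores on the single discriminant

op <- par(mfrow = c(1, 2))
# All scores on one line per group, jittered to reduce overplotting.
stripchart(scores ~ grp, method = "jitter", pch = 1, xlab = "LD1")
# One boxplot per group.
boxplot(scores ~ grp, ylab = "LD1")
par(op)
```

With only 14 cases the jittered strip chart is probably the more honest display, since it shows every observation.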


> 4) Can I work out from the coefficients which subgroups of my variables
> are better at discriminating than others?  I guess I could simply
> perform a t-test first to select the best variables...?

No, because univariate t-tests ignore possible projections: variables 
that discriminate poorly on their own can discriminate well in 
combination.
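
A small simulated example of what gets lost, under stated assumptions: x1 and x2 share a large nuisance component z, and the groups differ only in x2 by a shift that is small relative to z. All names and numbers are made up for illustration.

```r
library(MASS)

set.seed(7)
n <- 50
grp <- factor(rep(c("A", "B"), each = n))
z  <- rnorm(2 * n, sd = 5)                            # shared nuisance variation
x1 <- z + rnorm(2 * n, sd = 0.3)                      # no group effect at all
x2 <- z + rnorm(2 * n, sd = 0.3) + 2 * (grp == "B")   # small shift for group B
dat <- data.frame(grp, x1, x2)

# Univariate screening: one t-test per variable.
p_x1 <- t.test(x1 ~ grp, data = dat)$p.value
p_x2 <- t.test(x2 ~ grp, data = dat)$p.value

# Resubstitution accuracy of lda() for a given formula.
acc <- function(f) {
  fit <- lda(f, data = dat)
  mean(predict(fit, newdata = dat)$class == dat$grp)
}
acc_x2   <- acc(grp ~ x2)        # the better single variable
acc_both <- acc(grp ~ x1 + x2)   # lda can use the contrast x2 - x1
```

x1 carries no group information by itself, so a t-test filter would tend to drop it, yet it is what lets the projection cancel z and separate the groups.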


> 5) How do I turn my discriminant function into a classification
> function?  i.e. when I plot the scores for the groups I can see
> graphically that all the values for one group are below 0.1 and all the
> values for the other group are above 1.  But how do I turn my
> discriminant function into a classification function?

What about looking for the point on the discriminant axis where the 
posterior probability is 0.5?
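
A sketch of that idea for two groups, again on hypothetical simulated data. It assumes the default plug-in rule of predict() for lda objects, under which the within-group variance on the discriminant scale is 1, so the 0.5 point is the midpoint of the two group means on LD1, shifted by the log ratio of the priors.

```r
library(MASS)

# Assumption: simulated, overlapping two-group data in place of the real ones.
set.seed(3)
n <- 30
grp <- factor(rep(c("A", "B"), each = n))
x <- matrix(rnorm(2 * n * 3), ncol = 3)
x[grp == "B", 1] <- x[grp == "B", 1] + 2
dat <- data.frame(grp = grp, x)

fit  <- lda(grp ~ ., data = dat)
pred <- predict(fit, newdata = dat)
ld1  <- pred$x[, 1]                # scores on the single discriminant

m <- tapply(ld1, dat$grp, mean)    # group means on LD1
p <- fit$prior                     # priors used by lda (here 0.5 each)

# Score at which both posteriors equal 0.5.
cutoff <- unname(mean(m) + log(p["A"] / p["B"]) / (m["B"] - m["A"]))

# Classify by the cutoff and compare with the classes from predict().
b_is_high <- unname(m["B"] > m["A"])
by_cutoff <- ifelse((ld1 > cutoff) == b_is_high, "B", "A")
agree <- all(by_cutoff == as.character(pred$class))
```

In practice predict(fit, newdata)$class already is the classification function; the explicit cutoff is mainly useful for drawing the boundary on a plot of the scores.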

Uwe Ligges



> Many thanks in advance for your help
> 
> Mick
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



