[R] Discriminant Correspondence Analysis

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Dec 15 09:46:12 CET 2010

On Wed, 2010-12-15 at 02:48 -0500, Wayne Sawtell wrote:
> Thank you all for the advice.
> I have looked through the Introduction to R pdf and got some pointers but
> when I try to implement them it does not work. If someone could clarify a
> couple of basic things, I would appreciate it.
> When I successfully read in my file, the prompt changed from > to +. Then
> when I typed in the suggested commands, nothing happened.

That means that the commands you used to read in the file were
syntactically incomplete. In such cases, R changes the prompt from `>`
to `+` indicating that the previous line(s) were incomplete and extra
input is required.

New useRs hit this problem very often, usually because they forgot to
close quotes (in your case around the filename?) or left out a
closing parenthesis ( `)` ). The former causes no end of
frustration because, of course, until you close the quote, R thinks you
are just typing in a long string.
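
If you want to see what R makes of a line without getting stuck at the
`+` prompt, you can hand the code to `parse()` as a string. This is only
a rough sketch: the helper name `is_complete` and the file name
`mydata.csv` are made up for illustration, and nothing is actually read
because the code is parsed, not run.

```r
# Hypothetical helper: TRUE if `code` parses without a syntax error,
# FALSE otherwise (unfinished quotes or parentheses count as errors here).
is_complete <- function(code) {
  result <- tryCatch(parse(text = code), error = function(e) e)
  !inherits(result, "error")
}

is_complete('dat <- read.csv("mydata.csv")')  # quotes and parentheses matched
is_complete('dat <- read.csv("mydata.csv"')   # closing parenthesis missing
is_complete('dat <- read.csv("mydata.csv)')   # closing quote missing
```

At the console itself, pressing Esc (Windows GUI) or Ctrl-C (terminal)
abandons the incomplete input and returns you to the `>` prompt.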

I would suggest your file was not read in correctly or even at all.
Check you have matched quotes and parentheses. If you need, get a better
text editor that will highlight code and match brackets to help you spot
the errors. Tinn-R might be useful for you for example, on Windows.
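
For reference, a complete read of a .csv file might look something like
the sketch below. It writes the three sample rows from your message to a
temporary file first, purely so that the example runs on its own; the
names `csv_file` and `DF` are my own choices, and in practice `csv_file`
would be the quoted path to your file.

```r
# Write a tiny CSV to a temporary file so the example is self-contained;
# in practice `csv_file` would be the path to your own file.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("X1,X2,X3,X4,X5",
             "0.123,0.854,0.319,1,2",
             "0.562,0.472,0.917,0,1",
             "0.381,0.285,0.146,2,1"), csv_file)

# The assignment `DF <- ...` is what gives the data frame a name.
DF <- read.csv(csv_file)

str(DF)   # check how each column was read
dim(DF)   # number of rows and columns
```

If `str(DF)` shows the columns you expect, the file was read correctly
and `DF` can be passed on to other functions.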

> For the discrimin.coa command, the only part I don't understand is what to
> put for "fac".

Discriminant analysis finds linear combinations of variables that best
separate your group means. I don't know discriminant CA but it will be
doing something similar. The point is to find "rules" that predict the
groups. So 'fac' is where you tell `discrimin.coa` what groups your data
belong to, as a factor.

> Is this the grouping variable that I obtained from my
> Principal Co-ordinates Analysis?

Doubt it - in what sense does PCoA estimate groups or clusters of
samples? PCoA just extracts linear combinations of your variables that
have maximal "variance". It is a bit like PCA but using any
dissimilarity matrix. I'm not familiar with PCoA being used to provide a
grouping variable.
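
To illustrate, here is a small sketch using `cmdscale()` from base R,
which does classical PCoA. It uses the built-in `iris` measurements
purely as a stand-in (they are not your data). With Euclidean distances
the PCoA axes match the PCA scores up to sign, and in neither case does
the result contain any group labels.

```r
X <- as.matrix(iris[, 1:4])          # numeric measurements only

pcoa <- cmdscale(dist(X), k = 2)     # PCoA on Euclidean distances
pca  <- prcomp(X)$x[, 1:2]           # PCA scores for the same data

# The axes agree up to an arbitrary sign flip, so compare absolute values
max_diff <- max(abs(abs(pcoa) - abs(pca)))
max_diff < 1e-6

head(pcoa)   # just coordinates for each sample; no grouping variable
```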

> My goal, by the way, is to test whether the
> groups into which PCoA put my data are valid.

See the above; I doubt PCoA will be useful here.

> The data consist of specimen
> measurements and categorical observations. So I have a rectangular table of
> data with headings (names of measured characters) at the top of each column
> of numbers. This is a sample:
> X1       X2      X3       X4   X5
> 0.123  0.854  0.319  1     2
> 0.562  0.472  0.917  0     1
> 0.381  0.285  0.146  2     1
> where X4 is a body shape character, which I've converted to numbers, instead
> of words (0 - round, 1 - oblong, 2 - rectangular).

Don't do that. Store this as a factor in R with the labels "round",
"oblong" etc. R will store these internally as the integers 1, 2, 3, etc. but
will display them with nice names so you don't have to remember what the
codes mean.
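
Something along these lines would do the conversion, taking the X4
values from your three sample rows. The object names `codes` and `shape`
are just for illustration.

```r
# Shape codes as entered in the spreadsheet (0 = round, 1 = oblong, 2 = rectangular)
codes <- c(1, 0, 2)

shape <- factor(codes, levels = c(0, 1, 2),
                labels = c("round", "oblong", "rectangular"))

shape              # prints the labels, not the codes
as.integer(shape)  # the internal integer codes, which start at 1
table(shape)       # counts per shape
```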

> I've included X5, which
> is just the column in which I entered the group number into which PCoA
> grouped the data points or rows (each row represents a different specimen
> that was measured according to the characters in the headings). So, should I
> put "fac = X5"? Is that how Discriminant Correspondence Analysis works?

Unlikely. You are going to have to remove 'X5' from the data that you
pass as argument `df`, and you can't refer to X5 directly like that
without some extra effort. Assuming your data frame is called `DF`, you
could try:

discrimin.coa(DF[, -5], fac = as.factor(DF[, 5]))
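
Spelled out step by step, and using the three sample rows from your
message with the columns as you entered them, the same idea looks
roughly like this. The names `DF`, `tab` and `grp` are only
illustrative, and I have left the actual `discrimin.coa` call as a
comment because three rows are far too few for a meaningful analysis.

```r
# The three sample rows from the post; `DF` is just an illustrative name.
DF <- data.frame(X1 = c(0.123, 0.562, 0.381),
                 X2 = c(0.854, 0.472, 0.285),
                 X3 = c(0.319, 0.917, 0.146),
                 X4 = c(1, 0, 2),
                 X5 = c(2, 1, 1))

tab <- DF[, -5]               # everything except the grouping column
grp <- as.factor(DF[, 5])     # the grouping column, as a factor

stopifnot(length(grp) == nrow(tab))  # one group label per row
levels(grp)

# With a realistically sized data set, the call would then be
# (scannf = FALSE skips the interactive eigenvalue bar plot):
# library(ade4)
# res <- discrimin.coa(tab, fac = grp, scannf = FALSE, nf = 2)
```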

> thanks again and sorry if my question is too long

You seem to be missing a lot of the basics and also don't fully get the
statistical methods you are using; a bad combination in my book.

ade4 is well documented, so check and run through the examples for some
of the functions in the package to familiarise yourself with how things
work. If you need specific help, there is an ade4 mailing list which is
likely best placed for you to post your questions regarding use of
functions in that package.
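
As a starting point, the sketch below loads the package and runs the
example from the `discrimin.coa` help page that you quoted. Installing a
package does not load it; `library(ade4)` is the step that makes its
functions available, which is the usual reason R "cannot find" a
function. The `has_ade4` check is my own addition so the code does
nothing if the package is missing.

```r
# Only attempt this if ade4 is installed (install.packages("ade4") otherwise)
has_ade4 <- requireNamespace("ade4", quietly = TRUE)

if (has_ade4) {
  library(ade4)      # load the package so its functions can be found
  data(perthi02)     # example data set shipped with ade4
  res <- discrimin.coa(perthi02$tab, perthi02$cla, scannf = FALSE)
  names(res)         # components of the fitted object
  # ?discrimin.coa opens the help page; example(discrimin.coa) runs its examples
}
```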

Good luck,



> Wayne
> On 14 December 2010 18:39, Peter Ehlers <ehlers at ucalgary.ca> wrote:
> > Wayne,
> >
> > So far, no one has said the obvious:
> > Please do work your way through (or at least
> > skim) "An Introduction to R" which you'll
> > find right there on your computer under
> > Help/Manuals. Your questions indicate that
> > you have not yet done so. Do it, it really
> > will pay off.
> >
> > Peter Ehlers
> >
> >
> > On 2010-12-14 12:36, Wayne Sawtell wrote:
> >
> >> Hello everyone,
> >>
> >> I am totally new to the R program. I have had a look at some pdf documents
> >> that I downloaded and that explain how to do many things in R; however, I
> >> still cannot figure out how to do what I want to do, which is to perform
> >> Discriminant Correspondence Analysis on a rectangular matrix of data that
> >> I
> >> have in an Excel file. I know R users frown upon Excel and recommend
> >> converting Excel files to .csv format, which I have done, no problem. That
> >> is not an issue.
> >> There are several parts to my problem.
> >> 1) When I try the read.table command, even if I include the directory name
> >> in the filename, R still cannot read the file, even if it is in .csv
> >> format
> >> 2) I was able to copy my file and then read the clipboard contents into R
> >> but then I do not know to assign a name to the data frame in order to
> >> conduct any operations on it
> >> 3) I need the ADE4 program in order to perform Discriminant Correspondence
> >> Analysis, so I used the "install.packages" command to install it. It
> >> installed no problem but I do not know how to access the ADE4 program in
> >> R.
> >> I am unable to open it directly, either.
> >> 4) I thought that using the ADE4 GUI (called "ade4TkGUI") would be easier
> >> because I do not know many of the R commands; but, again, I downloaded it
> >> but cannot open or access it.
> >>
> >> The following is the suggested coding that I found through the R website,
> >> but when I try to use this code, I don't know how to assign a name for the
> >> df, or what to put for "fac", and what is worse, I get an error message
> >> saying that the program cannot find the "discrimin.coa" command.
> >>
> >>
> >> Usage
> >>
> >> discrimin.coa(df, fac, scannf = TRUE, nf = 2)
> >>
> >> Arguments
> >>
> >> df a data frame containing positive or null values
> >>
> >> fac a factor defining the classes of discriminant analysis
> >>
> >> scannf a logical value indicating whether the eigenvalues bar plot should
> >> be
> >> displayed
> >>
> >> nf if scannf FALSE, an integer indicating the number of kept axes
> >>
> >> Examples
> >>
> >> data(perthi02)
> >>
> >> plot(discrimin.coa(perthi02$tab, perthi02$cla, scan = FALSE))
> >> For clarification, my data consists of measurements of morphological
> >> characters of an assemblage of biological specimens. I have already
> >> performed Principal Co-ordinates Analysis, Principal Components Analysis
> >> and Cluster Analysis in another program (PAST) in order to see if the data
> >> fall into distinct groupings that might represent different morphological
> >> species. I now want to test the groupings that I found on my test data set
> >> using Discriminant Correspondence Analysis. There are both continuous and
> >> categorical characters, which is the reason why I need to perform
> >> Discriminant Correspondence Analysis, instead of Linear Discriminant
> >> Analysis, which is only valid for continuous measurements. R seems to be
> >> the
> >> only program in which I can perform Discriminant Correspondence Analysis.
> >>
> >> Thanks for any help offered on any of these points.
> >> Wayne
> >>
> >>        [[alternative HTML version deleted]]
> >>
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >

 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
