[R-sig-eco] Discriminant Analysis Crisis

Chris Howden chris at trickysolutions.com.au
Mon Jun 24 03:09:20 CEST 2013


The lda function has a corresponding predict function which you can use
to classify new data. Try having a look at that and its help
contents.
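
A minimal sketch of that workflow, assuming the MASS package (which
provides lda) and using the built-in iris data as a stand-in for the
trait data. The choice of three predictors is purely illustrative and
is not taken from the original analysis:

```r
library(MASS)  # provides lda() and its predict() method

# Stand-in data (assumption): iris replaces the real species-by-trait table
set.seed(1)
train_idx <- sample(nrow(iris), 120)
train <- iris[train_idx, ]

# "New" observations that carry only a subset of the measured variables
new_obs <- iris[-train_idx, c("Sepal.Length", "Petal.Length", "Petal.Width")]

# Fit the discriminant model using only the variables available for the new data
fit <- lda(Species ~ Sepal.Length + Petal.Length + Petal.Width, data = train)

# predict() returns the assigned group and the posterior probability of each group
pred <- predict(fit, newdata = new_obs)
pred$class
round(pred$posterior, 3)
```

Each new observation is assigned to the group with the largest posterior
probability, so there is no need to evaluate the group equations by hand.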

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and
Innovation, Data Analysis, Modelling and Training

(mobile) 0410 689 945
(fax / office)
chris at trickysolutions.com.au


On 22/06/2013, at 11:19, Alexandre Fadigas de Souza
<alexsouza at cb.ufrn.br> wrote:

> Dear friends,
>
>  I have a question about how to use discriminant analysis. Could you possibly help me?
>
>  Briefly, here is my case.
>
>  I have clustered 64 subtropical tree species using k-means non-hierarchical cluster analysis. The cluster analysis was based on 10 continuous traits measured for individuals of each species. The analysis yielded 5 groups that interestingly correspond to empirical expectations about forest ecology.
>
>  I have some 20 more species for which I do not have all 10 variables but which I need to fit into the five-group classification. Based on the literature, I proceeded with a Discriminant Analysis, which selected three of the ten variables as most strongly related to the groups, and found regressions related to each group. There is a separate regression for each of the five groups.
>
>  And here come my difficulties:
>
> 1 - How will I assign a given new species to any single group if there is a separate equation for each group? Should I fit the species data in each group's equation and then check which one gives a result nearest to that group number? This seems very awkward.
>
> 2 - The DA provides an untransformed equation with a constant but also a standardized version without the constant and with one fewer group (4 instead of 5). I suppose I should use the unstandardized version, shouldn't I?
>
>   Thank you in advance for any help.
>
>   Sincerely,
>
>   Alexandre
>
> Dr. Alexandre F. Souza
> Professor Adjunto II Departamento de Botanica, Ecologia e Zoologia  Universidade Federal do Rio Grande do Norte (UFRN)  http://www.docente.ufrn.br/alexsouza  Curriculo: lattes.cnpq.br/7844758818522706


