[R] (no subject)

Jari Oksanen jari.oksanen at oulu.fi
Wed Sep 9 18:28:00 CEST 2009


Hello Kim & Gav,

Gavin Simpson <gavin.simpson <at> ucl.ac.uk> writes:

> 
> On Wed, 2009-09-09 at 14:43 +0200, Kim Vanselow wrote:
> > Dear r-Community,
> > Step1: I would like to calculate a NMDS (package vegan, function
> > metaMDS) with species data.
> > Step2: Then I want to plot environmental variables over it, using
> > function envfit.
> > The Problem: One of these environmental variables is cos(EXPOSURE).
> > But for flat releves there is no exposure. The value is missing and I
> > can't call it 0 as 0 stands for east and west. Therefore I kicked all
> > releves with missing environmental variables. Both, metaMDS and envfit
> > then work without problems.
> > Now I want to bring the releves with missing environmental variables
> > back to my ordination-plot.
> > 
> > Gavin Simpson gave me the advice to use the predict-function for the
> > same missing value problem when I was calculating a cca. This worked
> > without problem.
>

> 
> Also note that Jari has commented on how you are coding your Exposure
> variable; I glossed over that bit when providing my response and you
> should probably rethink your coding along the lines Jari suggested.
>
Yes, and Dylan Beaudette gave some more concrete ideas in this thread
(provided it is the Exposure to sun, and not to, say, wind or a pollution source).
 
> There isn't a predict function for metaMDS() because there aren't rules
> or relationships that would allow you to do what predict.cca does but
> for a nMDS. In the CCA case we were estimating the locations in
> ordination space for the left-out samples on the basis of their species
> composition and computing their site score as a weighted average of the
> species scores determined from the data that went into building the CCA.
>
Actually, you can add new points to NMDS. I once wrote a function for a
user who asked for this, but I did not put it in vegan, because using it
needs considerable technical skill and it was too tricky for a general
package. For instance, you need a rectangular dissimilarity matrix between
the new points and the old points (which you can get with Gav's distance()
function in analogue).
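
To illustrate what such a rectangular matrix looks like, here is a minimal
base R sketch. It is not the function mentioned above and it does not add
points to an ordination. The helper name bray_rect and the toy data are
hypothetical, and Bray-Curtis is assumed only because it is the default
dissimilarity in metaMDS. Any transformation applied to the original data
would have to be applied to the new releves as well.

```r
# Hypothetical helper: Bray-Curtis dissimilarity between every row of
# `new` (new releves) and every row of `old` (releves already ordinated).
# Both matrices must have the same species columns in the same order.
bray_rect <- function(new, old) {
  new <- as.matrix(new)
  old <- as.matrix(old)
  stopifnot(ncol(new) == ncol(old))
  out <- matrix(NA_real_, nrow(new), nrow(old),
                dimnames = list(rownames(new), rownames(old)))
  for (i in seq_len(nrow(new))) {
    for (j in seq_len(nrow(old))) {
      out[i, j] <- sum(abs(new[i, ] - old[j, ])) / sum(new[i, ] + old[j, ])
    }
  }
  out
}

# Toy abundance data (made up for illustration only)
old <- matrix(c(1, 0, 3,
                0, 2, 2,
                4, 1, 0), nrow = 3, byrow = TRUE,
              dimnames = list(c("s1", "s2", "s3"), c("spA", "spB", "spC")))
new <- matrix(c(1, 0, 3,
                0, 0, 5), nrow = 2, byrow = TRUE,
              dimnames = list(c("n1", "n2"), c("spA", "spB", "spC")))

d <- bray_rect(new, old)   # 2 x 3: rows are new releves, columns old ones
```

The result has one row per new releve and one column per old releve, unlike
the square symmetric matrix that goes into the NMDS itself.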

> 
> What you now want to do is superimpose upon that plot the env data where
> you might have missingness. envfit doesn't allow missings and it is not
> immediately clear to me how it might be modified to do so, certainly not
> without a lot of changes.
>
True. This seems to be bad design. Supporting missing values would mean
changing four functions and the user interface. Even then it would amount
to na.rm (removing rows with any missing data), and if you want that, you
can do it by hand:

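# site scores from the NMDS result
scor <- scores(mymetaMDSresult, display = "sites")
# TRUE for releves with no missing environmental values
keep <- complete.cases(myenvdata)
envfit(scor[keep, ] ~ ., myenvdata[keep, ])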
 
> Instead, ordisurf() may be more useful, but you will lose the nice
> "plot all vectors on the ordination at once" feature of envfit (you gain
> a lot with ordisurf though as there is no reason to assume anything is
> linear across an nMDS configuration).
> 
> A cursory look at the guts of ordisurf indicates that it can handle
> missing values in the (env) variable you wish to overlay onto the nMDS
> plot; the data is passed to mgcv::gam using the formula interface which
> deals with the NA.
>
Yes, this is true (in most cases).
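
A small sketch of this with vegan's built-in dune data, where two values of
A1 are set to NA purely for illustration (the choice of variable and of rows
is an assumption, not part of the original question). It assumes that
ordisurf() fits the surface to the releves with a non-missing value only,
as described above.

```r
library(vegan)

data(dune)
data(dune.env)

# Pretend the variable is missing for two releves (illustration only)
a1 <- dune.env$A1
a1[c(3, 7)] <- NA

set.seed(1)                             # metaMDS uses random starts
ord <- metaMDS(dune, trace = FALSE)     # NMDS on all 20 releves

# Fit a smooth surface of a1 over the NMDS configuration, without plotting
fit <- ordisurf(ord, a1, plot = FALSE)

# Number of releves that actually entered the GAM fit
n_used <- length(fitted(fit))
```

With plot = TRUE (the default) the contours are drawn over the whole
configuration, so the releves with missing values can still be shown as
points.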
 

> 
> You could also try capscale() in vegan. This is like CCA and RDA
> but allows you to use any dissimilarity coefficient. It is a bit like a
> constrained Principal Coordinates Analysis. This can use the rda method
> for predict to do what you did with the CCA earlier.
> 
However, it does not handle missing values directly (not even in the R-Forge
version, and there are no immediate plans to change this). So you must
remove missing data rows manually.
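
Removing the rows manually could look like the following sketch, again with
the dune data and artificially introduced NA values. The constraining
variables A1 and Management and the Bray-Curtis distance are arbitrary
choices for illustration.

```r
library(vegan)

data(dune)
data(dune.env)

# Introduce two missing values for illustration only
env <- dune.env
env$A1[c(4, 11)] <- NA

# Keep releves that are complete for the constraints used in the model
keep <- complete.cases(env[, c("A1", "Management")])
spp  <- dune[keep, ]
envk <- env[keep, ]

# Constrained analysis of principal coordinates on the complete releves
mod <- capscale(spp ~ A1 + Management, data = envk, distance = "bray")

site_scores <- scores(mod, display = "sites")
```

The species and environmental tables must be subset with the same logical
vector so that their rows stay aligned.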

Cheers, Jari Oksanen



