[R-sig-eco] NA error in envfit

Jari Oksanen jari.oksanen at oulu.fi
Fri Dec 6 19:07:37 CET 2013


On 05/12/2013, at 18:42 PM, Dixon, Philip M [STAT] wrote:
> 
> I wonder if the problem is a factor level with no observations.  One of the frustrating things about factors (class variables) in R is that the list of levels is stored separately from the data.  This can cause all sorts of problems if you create the factor, then subset the data, and the subset is missing one or more levels of the factor.  You are subsetting your data, so this may be the source of the problem.
> 
> My working philosophy is to keep variables as character strings or numbers until just before I need the factors.  That avoids any issues with extraneous levels.  That means reading data sets (.txt or .csv files) with as.is=TRUE to avoid default creation of factors.  relevel() may recreate the list of levels.  I usually use factor(as.character(variable)) to flip a factor to a vector of character strings then back to a factor with the correct set of levels.
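
A minimal base-R sketch of the stale-level behaviour described above, using a made-up factor (the names `site` and `sub` are only illustrative). Note that relevel() only moves a reference level to the front; droplevels() is the base function that removes unused levels.

```r
# Hypothetical data: a factor with three levels
site <- factor(c("bog", "fen", "bog", "heath"))

# Subsetting keeps the original list of levels
sub <- site[site != "heath"]
levels(sub)    # "heath" is still listed
table(sub)     # ... with a count of zero

# Two ways to get rid of the unused level
dropped   <- droplevels(sub)
roundtrip <- factor(as.character(sub))
levels(dropped)
levels(roundtrip)
```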

Philip,

It very much looks as if this kind of approach is the source of all evil. In envfit() we *assume* that a variable that is not a factor is numeric. If it is a character string instead of being numeric, you get those strange error messages. We do take care of extraneous factor levels in envfit(), but we expected variables to be either factors or numeric -- we did not expect character strings. I guess we have to add some ugly code to handle these cases and either cast character strings to factors or ignore variables that are neither numeric nor factors.
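
A minimal user-side sketch of such a cast in base R, assuming the environmental data sit in a data frame. The data, the column names and the object `ord` are made up for illustration, and this is not the code inside envfit().

```r
# Hypothetical environmental data, read without automatic factor creation
env <- data.frame(pH = c(4.1, 5.3, 6.0),
                  habitat = c("bog", "fen", "bog"),
                  stringsAsFactors = FALSE)

sapply(env, class)   # habitat is "character": neither numeric nor factor

# Cast every character column to a factor, leave the other columns alone
is_chr <- vapply(env, is.character, logical(1))
env[is_chr] <- lapply(env[is_chr], factor)

sapply(env, class)   # habitat is now "factor"

# The converted data frame could then be passed on, e.g. envfit(ord, env),
# where ord stands for an ordination result that is assumed to exist.
```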

Cheers, Jari Oksanen

