[R-sig-eco] anova.cca question / missing data in constraining matrix

Jari Oksanen jari.oksanen at oulu.fi
Sat Jun 1 07:18:50 CEST 2013


On 01/06/2013, at 05:20 AM, Jari Oksanen wrote:
>> 
>> The CCA seems to run just fine, but when I attempt to do the posthoc tests
>> such as anova.cca (anova(toolik250.cca,by='terms',perm=999)), I get an error
>> message: "Error in anova.ccabyterm(object, step = step, ...) : number of
>> rows has changed: remove missing values?"  What exactly is occurring here to
>> cause this error - I suspect it must be related to the fact that the
>> environmental data matrix has a lot of missing data.  I don't quite
>> understand why it states that the number of rows has changed...changed from
>> what?  
> 
> The number of rows has changed from term to term. That is, you have different numbers of missing values in each term (= explanatory variable), and when rows with missing values are removed for the current model, the accepted observations change from term to term. I admit the error message is not the most obvious one. I must see where it comes from, and how to make it more informative. However, it does give a hint to "remove missing values", doesn't it?
> 
> If you want to have a term-wise test with missing values in terms, you must refit your model for complete cases. Use argument 'subset' to select a subset with no missing values. Currently I don't know any nice shortcut to do this with the current model, but the following may work (untested), although it is not nice:
> 
> keep <- rep(TRUE, nrow(tooliken.s))
> keep[toolik250.cca$na.action] <- FALSE
> m2 <- update(toolik250.cca, subset = keep)
> anova(m2, by="terms", perm=999)

Actually, there is a slightly easier way of doing this, because 'subset' can also be a vector of indices, and negative indices can be used to remove observations. If 'toolik250.cca' is a result object with missing observations, then

m2 <- update(toolik250.cca, subset = -toolik250.cca$na.action)

will remove the items listed as removed in 'na.action' (NB. the minus sign in 'subset'). The update()d model will be equal to the original model, but with the missing data removed.
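
To see why the logical 'keep' vector and the negative indices pick the same rows, here is a minimal base R sketch. It does not use vegan: the data frame 'env' and its values are made up for illustration, and na.omit() stands in for the model fit, on the assumption that the 'na.action' component of a fitted model holds the same kind of row index.

```r
# Hypothetical environmental table; names and values are illustrative only.
env <- data.frame(
  pH   = c(5.1, NA, 4.8, 5.5, 5.0, NA),
  temp = c(10.2, 9.8, NA, 11.0, 10.5, 9.9)
)

# na.omit() records the numbers of the dropped rows in the "na.action"
# attribute; as.vector() strips the names and class and leaves plain integers.
dropped <- as.vector(attr(na.omit(env), "na.action"))

# Variant 1: logical vector, TRUE for the rows to keep.
keep <- rep(TRUE, nrow(env))
keep[dropped] <- FALSE
rows_logical <- env[keep, ]

# Variant 2: negative indices remove the listed rows directly.
rows_negative <- env[-dropped, ]

# Both variants select the same complete rows.
identical(rows_logical, rows_negative)
```

One caveat of the negative form in base R: if no rows were dropped, the index is empty and an empty negative index selects no rows at all, so the logical form is the safer one in that case.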

Cheers, Jari Oksanen 

-- 
Jari Oksanen, Dept Biology, Univ Oulu, 90014 Finland
jari.oksanen at oulu.fi, Ph. +358 400 408593, http://cc.oulu.fi/~jarioksa


