[R] anova() interpretation and error message
Peter Ehlers
ehlers at ucalgary.ca
Sun Feb 6 16:42:59 CET 2011
See comments inline.
On 2011-02-06 03:17, Jinsong Zhao wrote:
> Hi there,
>
> I have a data frame as listed below:
>
> > Ca.P.Biomass.A
>          P   Biomass
> 1 334.5567 0.2870000
> 2 737.5400 0.5713333
> 3 894.5300 0.6393333
> 4 782.3800 0.5836667
> 5 857.5900 0.6003333
> 6 829.2700 0.5883333
>
> I have fitted the data with logistic, Michaelis–Menten, and linear models;
> all three fits are significant.
>
> > fm1<- nls(Biomass~SSlogis(P, phi1, phi2, phi3), data=Ca.P.Biomass.A)
> > fm2<- nls(Biomass~SSmicmen(P, phi1, phi2), data=Ca.P.Biomass.A)
> > fm3<- lm(Biomass~P, data = Ca.P.Biomass.A)
>
> I want to compare the three models, and I am using anova() to do so.
> In this example, the three models do not seem to differ significantly.
> However, I am confused by the negative Df in the following ANOVA table.
> My question is how to interpret the results if Pr(>F) < 0.05.
>
> > anova(fm1,fm2,fm3)
> Analysis of Variance Table
>
> Model 1: Biomass ~ SSlogis(P, phi1, phi2, phi3)
> Model 2: Biomass ~ SSmicmen(P, phi1, phi2)
> Model 3: Biomass ~ P
>   Res.Df Res.Sum Sq Df      Sum Sq F value Pr(>F)
> 1      3 0.00063741
> 2      4 0.00087249 -1 -0.00023508  1.1064 0.3701
> 3      4 0.00142751  0  0.00000000
Read the Details section in help(anova.lm) to see why
you get negative Df values. Each model is compared with
the one listed before it, so the Df in row 2 is simply
3 - 4 = -1. Compare
    anova(fm1, fm2)
with
    anova(fm2, fm1)
That's why I prefer to list my models in order of increasing
complexity. Note also that the F tests can only be interpreted
when the models are nested. Yours aren't.
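
The effect of the argument order can be reproduced from the data in the original post. This is only a sketch: it assumes the six printed (rounded) values are a close enough stand-in for the full-precision data and that both self-starting fits converge on them, as they evidently did for the poster. The object names a12 and a21 are purely illustrative.

```r
# Data re-entered from the printed data frame (rounded, so results may
# differ from the post in the last digits).
Ca.P.Biomass.A <- data.frame(
  P       = c(334.5567, 737.5400, 894.5300, 782.3800, 857.5900, 829.2700),
  Biomass = c(0.2870000, 0.5713333, 0.6393333, 0.5836667, 0.6003333, 0.5883333)
)

fm1 <- nls(Biomass ~ SSlogis(P, phi1, phi2, phi3), data = Ca.P.Biomass.A)
fm2 <- nls(Biomass ~ SSmicmen(P, phi1, phi2), data = Ca.P.Biomass.A)

# Residual degrees of freedom: 6 observations minus the number of
# estimated parameters (3 for the logistic, 2 for Michaelis-Menten).
df.residual(fm1)
df.residual(fm2)

# Same pair of models, listed in both orders.
a12 <- anova(fm1, fm2)  # larger model first
a21 <- anova(fm2, fm1)  # smaller model first

# The Df column is the change in residual df from one row to the next,
# so its sign depends only on the order of the arguments.
a12[["Df"]]
a21[["Df"]]
```

In the tables quoted in the post, both orderings report the same F value and p-value for this pair; only the signs of Df and Sum Sq flip.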
>
> When the order of the arguments changes, anova() gives different
> results. It seems that anova() compares the first model with all the
> other models.
>
> > anova(fm2,fm1,fm3)
> Analysis of Variance Table
>
> Model 1: Biomass ~ SSmicmen(P, phi1, phi2)
> Model 2: Biomass ~ SSlogis(P, phi1, phi2, phi3)
> Model 3: Biomass ~ P
>   Res.Df Res.Sum Sq Df      Sum Sq F value Pr(>F)
> 1      4 0.00087249
> 2      3 0.00063741  1  0.00023508  1.1064 0.3701
> 3      4 0.00142751 -1 -0.00079010  3.7187 0.1494
>
> When I put fm3, a linear model, in the first position, with the two nls
> models following it, anova() gives the following error message. This
> seems abnormal.
Not so strange. If you look at
    methods(anova)
you will see that there are different functions for
different classes of model. So, if your first model is
of class "nls", then anova.nls will be used; for a
first model of class "lm" anova.lm is used. The main
thing is not to mix models of different class. If you
want a linear model to compare with the nonlinear
ones, use nls() to estimate the linear model. But you
still have the problem of non-nestedness.
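
One way to follow this advice is to refit the straight line with nls(), so that every model has class "nls". The following is a sketch only: the parameter names a and b, the starting values, and the object name fm3n are assumptions made for illustration, and the data are re-entered from the printed data frame.

```r
# Data re-entered from the printed data frame in the original post.
Ca.P.Biomass.A <- data.frame(
  P       = c(334.5567, 737.5400, 894.5300, 782.3800, 857.5900, 829.2700),
  Biomass = c(0.2870000, 0.5713333, 0.6393333, 0.5836667, 0.6003333, 0.5883333)
)

# The straight line as fitted in the post.
fm3 <- lm(Biomass ~ P, data = Ca.P.Biomass.A)

# The same straight line fitted by nls(). The model is linear in a and b,
# so rough starting values are enough.
fm3n <- nls(Biomass ~ a + b * P, data = Ca.P.Biomass.A,
            start = list(a = 0, b = 0.001))

# anova() dispatches on the class of its first argument.
class(fm3)
class(fm3n)

# Both fits minimise the same residual sum of squares, so the
# coefficients and the deviance should agree.
all.equal(unname(coef(fm3)), unname(coef(fm3n)), tolerance = 1e-6)
all.equal(deviance(fm3), deviance(fm3n), tolerance = 1e-6)
```

With fm1 and fm2 from the post, a call such as anova(fm3n, fm2, fm1) would then be handled by the nls method throughout. The models are still not nested, so the resulting F tests remain hard to interpret.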
>
> > anova(fm3,fm1,fm2)
> Analysis of Variance Table
>
> Response: Biomass
>           Df   Sum Sq  Mean Sq F value    Pr(>F)
> P          1 0.081163 0.081163  227.43 0.0001127 ***
> Residuals  4 0.001428 0.000357
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Warning message:
> In anova.lmlist(object, ...) :
> models with response c("NULL", "NULL") removed because response
> differs from model 1
>
> Any suggestions and comments will be really appreciated. Thanks in advance.
>
After the above comments, I have one more:
You have 6 observations and you want to fit relatively
complex models whose estimated coefficients will be
*extremely* sensitive to *one* of your observations
(I assume that you have looked at a plot of the data).
The only way this could make any sense is if there is
well established theory that specifies one particular
model and you want to check that your data are at
least not obviously inconsistent with that theory.
Fishing through several models with six data points
makes no sense. I think that the best you can conclude
from your data is that Biomass probably increases with P.
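
A rough way to see this sensitivity without a plot is to look at the leverage of each observation and at how the fitted slope changes when each observation is dropped in turn. The sketch below uses only the straight-line fit, since it is the simplest of the three models, and works on the data re-entered from the printed data frame. The name loo_slope is hypothetical.

```r
# Data re-entered from the printed data frame in the original post.
Ca.P.Biomass.A <- data.frame(
  P       = c(334.5567, 737.5400, 894.5300, 782.3800, 857.5900, 829.2700),
  Biomass = c(0.2870000, 0.5713333, 0.6393333, 0.5836667, 0.6003333, 0.5883333)
)

fm3 <- lm(Biomass ~ P, data = Ca.P.Biomass.A)

# Leverage of each observation; a value near 1 means the fitted line is
# forced to pass close to that point.
round(hatvalues(fm3), 3)

# Slope of Biomass on P with each observation left out in turn.
loo_slope <- sapply(seq_len(nrow(Ca.P.Biomass.A)), function(i) {
  unname(coef(lm(Biomass ~ P, data = Ca.P.Biomass.A[-i, ]))["P"])
})
round(cbind(dropped = seq_along(loo_slope), slope = loo_slope), 6)
```

Observation 1 (P = 334.5567) lies far from the other five P values, so it has the largest leverage of the six.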
Peter Ehlers
> Regards,
> Jinsong