[R-sig-ME] logLik

Wed Dec 17 15:13:31 CET 2008

  (previous post bounced due to GPG wrapper)

Daniel Ezra Johnson wrote:
> Hi,
> 
> This is from the help file for logLik():
> 
>> x <- 1:5
>> lmx <- lm(x ~ 1)
>> logLik(lmx)
> 
> 'log Lik.' -8.82756 (df=2)
> 
> Two questions:
> 1) doesn't the model lmx have one degree of freedom, not two?

  If you count the residual variance, which is estimated implicitly
along with the intercept, it has two.
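
  A quick way to see where the two comes from, as a minimal sketch
using the same help-file example (the names n_coef and df_ll are just
illustrative):

```r
x   <- 1:5
lmx <- lm(x ~ 1)

n_coef <- length(coef(lmx))        # regression coefficients: intercept only
df_ll  <- attr(logLik(lmx), "df")  # parameters counted by logLik()

# The difference is the residual variance, which is estimated
# but does not appear in coef()
c(coefficients = n_coef, logLik_df = df_ll)
```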

> 2) how is this log-likelihood calculated?

  See stats:::logLik.lm; the core is

  val <- 0.5 * (sum(log(w)) - N * (log(2 * pi) + 1 - log(N) +
        log(sum(w * res^2))))

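where res are the residuals, w are the weights, and N is the number of
points. This is the normal log-likelihood evaluated at the
maximum-likelihood variance estimate, sum(w * res^2)/N.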
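
  To check the expression by hand on the help-file example, something
like the following should work. It is only a sketch: it assumes an
unweighted fit (so w is all ones), and val2 / sigma_ml are names made
up here for the cross-check against dnorm().

```r
x   <- 1:5
lmx <- lm(x ~ 1)

res <- residuals(lmx)   # residuals from the fitted model
N   <- length(res)      # number of points
w   <- rep(1, N)        # assumption: unweighted fit, so unit weights

# Same expression as the core of stats:::logLik.lm
val <- 0.5 * (sum(log(w)) - N * (log(2 * pi) + 1 - log(N) +
       log(sum(w * res^2))))

# Cross-check: sum of normal log-densities at the ML variance estimate
sigma_ml <- sqrt(sum(res^2) / N)
val2 <- sum(dnorm(x, mean = fitted(lmx), sd = sigma_ml, log = TRUE))

c(by_formula = val, by_dnorm = val2, logLik = as.numeric(logLik(lmx)))
```

  All three numbers should agree with the value printed by logLik(lmx).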

> 
> If I have two nested linear models (say lm models, not worrying about
> mixed models here), I know how to compare them using an F-test, but I
> don't understand the difference (if there is one) between using an
> F-test and using a likelihood-ratio test.

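  The likelihood ratio test relies on an asymptotic (large-sample)
chi-squared approximation, whereas the F test is exact for linear
models with normal errors, so you should use an F test if you're in a
situation where it's appropriate.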

> x <- rnorm(100)
> y <- 1+2*x+rnorm(100,sd=1)
> lm2 <- lm(y~x)
> lm1 <- lm(y~1)
> anova(lm2,lm1)
Analysis of Variance Table

Model 1: y ~ x
Model 2: y ~ 1
  Res.Df     RSS Df Sum of Sq      F    Pr(>F)
1     98   94.16
2     99  492.44 -1   -398.29 414.54 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

  str() on the anova result shows that the actual p-value is 5.54e-37.
The likelihood ratio test gives a somewhat smaller one:

> pchisq(2*(logLik(lm2)-logLik(lm1)),1,lower.tail=FALSE)
[1] 7.325424e-38
attr(,"nall")
[1] 100
attr(,"nobs")
[1] 100
attr(,"df")
[1] 3
attr(,"class")
[1] "logLik"
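
  For nested normal linear models the two statistics are tied
together: the LRT statistic is N*log(RSS1/RSS2), which can be rewritten
in terms of F. A rough sketch of the comparison follows; set.seed() is
added here only so the run is repeatable (the numbers will differ from
those above), and Fstat, p_F, lrt, p_lrt and lrt_from_F are
illustrative names.

```r
set.seed(1)   # assumption: seed added for reproducibility, not in the post
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100, sd = 1)
lm2 <- lm(y ~ x)
lm1 <- lm(y ~ 1)

# Exact F test; pull the unrounded values out of the anova table
a     <- anova(lm1, lm2)
Fstat <- a[["F"]][2]
p_F   <- a[["Pr(>F)"]][2]

# Likelihood ratio statistic; as.numeric() drops the logLik attributes
# that otherwise get carried through pchisq()
lrt   <- as.numeric(2 * (logLik(lm2) - logLik(lm1)))
p_lrt <- pchisq(lrt, df = 1, lower.tail = FALSE)

# LRT = N * log(RSS1/RSS2) = N * log(1 + q * F / (residual df of lm2)),
# with q = 1 extra parameter here
N <- length(y)
lrt_from_F <- N * log(1 + 1 * Fstat / df.residual(lm2))

c(F = Fstat, p_F = p_F, LRT = lrt, p_LRT = p_lrt)
all.equal(lrt, lrt_from_F)
```

  Since the LRT statistic is a monotone function of F, the two tests
order models the same way; they differ only in the reference
distribution used for the p-value.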

-- 
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc