[R] maximum likelihood accuracy - comparison with Stata

Mon Mar 28 21:00:54 CEST 2011

Are you sure that 1.78 is not the estimate of sigma and 3.14 the
estimate of sigma^2.

Best Regards

John

On Monday, 28 March 2011, Peter Ehlers <ehlers at ucalgary.ca> wrote:
> On 2011-03-27 21:37, Alex Olssen wrote:
>
> Hi everyone,
>
> I am looking to do some manual maximum likelihood estimation in R.  I
> have done a lot of work in Stata and so I have been using output
> comparisons to get a handle on what is happening.
>
> I estimated a simple linear model in R with   lm()   and also my own
> maximum likelihood program.  I then compared the output with Stata.
> Two things jumped out at me.
>
> Firstly, in Stata my coefficient estimates are identical in both the
> OLS estimation by   -reg-  and the maximum likelihood estimation using
> Stata's   ml lf  command.
> In R my coefficient estimates differ slightly between   lm()   and my
> own maximum likelihood estimation.
>
> Secondly, the estimates for   sigma2   are very different between R
> and Stata.  3.14 in R compared to 1.78 in Stata.
>
> I have copied my maximum likelihood program below.  It is copied from
> a great intro to MLE in R by Macro Steenbergen
> http://artsci.wustl.edu/~jmonogan/computing/r/MLE_in_R.pdf
>
> Any comments are welcome.  In particular I would like to know why the
> estimate of   sigma2   is so different.  I would also like to know
> about the accuracy of the coefficient estimates.
>
>
> Some comments:
>
> 1.
> Since Kmenta is not in the datasets package, you should mention
> where you're getting it. (I suppose that for economists, it's
> obvious, but we're not all economists.) I used the version in
> the systemfit package.
>
> 2.
> I don't believe your Stata value for sigma2 (by which I assume
> you mean sigma^2). Are you quoting sigma?
>
> 3.
> I can't reproduce your R value of 3.14 for sigma2. I get 3.725391.
>
> 4.
> (most important) There's a difference between the simple ML
> estimate (which is biased) and R's unbiased estimate for sigma^2.
> This dataset has 20 cases, so try multiplying your result by 20/17.
>
> 5.
> As to any difference in coefficients, I would guess that the
> difference is slight (you don't show what it is); it
> may be due to the fact that optim() produces approximate
> solutions, whereas in this simple case, an 'exact' solution
> is possible (and found by R).
>
> 6.
> In case you weren't aware of it: the stats4 package has an
> mle() function.
>
> Peter Ehlers
>
>
>
> ## ols
> ols<- lm(Kmenta$consump ~ Kmenta$price + Kmenta$income)
> coef(summary(ols))
>
> ## mle
> y<- matrix(Kmenta$consump)
> x<- cbind(1, Kmenta$price, Kmenta$income)
> ols.lf<- function(theta, y, x) {
>    N<- nrow(y)
>    K<- ncol(x)
>    beta<- theta[1:K]
>    sigma2<- theta[K+1]
>    e<- y - x%*%beta
>    logl<- -0.5*N*log(2*pi)-0.5*N*log(sigma2)-((t(e)%*%e)/(2*sigma2))
>    return(-logl)
> }
> p<- optim(c(0,0,0,2), ols.lf, method="BFGS", hessian=T, y=y, x=x)
> OI<- solve(p$hessian)
> se<- sqrt(diag(OI))
> se<- se[1:3]
> beta<- p$par[1:3]
> results<- cbind(beta, se)
> results
> sigma2<- p$par[4]
> sigma2
>
> Kind regards,
>
> Alex Olssen
> Motu Research
> Wellington
> New Zealand
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:frainj at tcd.ie
mailto:frainj at gmail.com