[R-sig-ME] Negative R-squared in linear regression: What's themeaning
Liaw, Andy
andy_liaw at merck.com
Tue Apr 20 15:02:58 CEST 2010
I don't know if you got any reply off-list, but I didn't see any coming
through the mailing list, so...
The negative value is for the "adjusted" R-squared, and there's no
guarantee that that has to be non-negative. In your case you only have
one degree of freedom for error, thus maximizing the chance for that
statistic to have a negative value. R-squared is 1 - SSE/SStotal, and
indeed is equivalent to squared correlation between response and fitted
values, but adjusted R-squared is 1 - MSE/MStotal, and that need not be
positive, and does not correspond to squared correlation.
If you are fitting a model with 1 df for error, you need to learn the
danger of doing usual inference with such a model.
Andy
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf
> Of Djibril Dayamba
> Sent: Friday, April 16, 2010 8:00 AM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Negative R-squared in linear regression:
> What's themeaning
>
> Dear all,
> I am trying to run a regression with two variables (Rainfaill
> and Rainydays) as predictor of species Abundance (Abund). I
> got a negative adjusted R-squared (-1.055) which is very
> strange for me; indeed as far as I understand this statistics
> could never be negative as it is supposed to be the squared
> value of R.
> I would really appreciate if somebody could tell me why is it
> so and what does this mean in practice (please see the output
> from R below)?
>
> Regards,
>
> Sidzabda Djibril Dayamba,
> Swedish University of Agricultural Sciences
> Faculty of Forest Science
> Southern Swedish Forest Research Centre
> Tropical Silviculture and Seed Laboratory
> PO Box 101
> SE - 230 53 Alnarp,
> Sweden
> Tel: +46 76 83 515 70 (Mobile)
> +46 40 41 53 95 (Office)
>
>
> > mod1<-lm(Abund~Rain+Rdays)
> > summary(mod1)
>
> Call:
> lm(formula = Abund ~ Rain + Rdays)
>
> Residuals:
> 1 2 3 4
> 0.8104 -3.4108 0.8480 1.7524
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 88.444166 27.319474 3.237 0.191
> Rain -0.005913 0.012995 -0.455 0.728
> Rdays -0.112197 0.605448 -0.185 0.883
>
> Residual standard error: 4.01 on 1 degrees of freedom
> Multiple R-squared: 0.315, Adjusted R-squared: -1.055
> F-statistic: 0.23 on 2 and 1 DF, p-value: 0.8276
>
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf
> Of r-sig-mixed-models-request at r-project.org
> Sent: den 16 april 2010 12:00
> To: r-sig-mixed-models at r-project.org
> Subject: R-sig-mixed-models Digest, Vol 40, Issue 33
>
> Send R-sig-mixed-models mailing list submissions to
> r-sig-mixed-models at r-project.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> or, via email, send a message with subject or body 'help' to
> r-sig-mixed-models-request at r-project.org
>
> You can reach the person managing the list at
> r-sig-mixed-models-owner at r-project.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of R-sig-mixed-models digest..."
>
>
> Today's Topics:
>
> 1. Re: profile() for lmer (Douglas Bates)
> 2. Re: Another case of -1.0 correlation of random effects
> (Kevin E. Thorpe)
> 3. Re: MCMCglmm: meta-analysis problem (Gustaf Granath)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 15 Apr 2010 13:04:53 -0500
> From: Douglas Bates <bates at stat.wisc.edu>
> To: pierre nouvellet <pierrenouvellet at hotmail.com>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] profile() for lmer
> Message-ID:
> <l2q40e66e0b1004151104tcf945fd5oce48b0c3dafa11ba at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
>
> On Thu, Apr 15, 2010 at 5:40 AM, pierre nouvellet
> <pierrenouvellet at hotmail.com> wrote:
>
> > Thanks again to Rob!
>
> > the call to 'profile(fm1ML at env)' ?worked perfectly!
> > it is really nice to see it working smoothly!!
>
> Thanks to those who picked up on this. Both Martin and I have been
> doing a lot of development on lme4a in the past couple of weeks,
> including fundamental changes to the internal representation. We
> didn't mean to turn the process of determining the profiles into a
> treasure hunt - that just sort-of happened.
>
> I should post a warning that the profile is likely to break again in
> the near future when I move in the wonderful new and improved class
> representation currently lurking around as a resurrected lmer2
> function. We will get the profiling working again after we break it
> but there is a lot going on under the hood right now and some breakage
> is inevitable.
>
> As others have discovered, lme4a now depends on the Rcpp package and,
> inadvertently, on an unreleased version. That should be fixed soon.
>
> Thanks for your patience.
>
>
> > pierre
> >
> >> Date: Thu, 15 Apr 2010 22:07:30 +1200
> >> From: p.taylor at niwa.co.nz
> >> To: pierrenouvellet at hotmail.com; r-sig-mixed-models at r-project.org
> >> Subject: Re: [R-sig-ME] profile() for lmer
> >>
> >> Hi Pierre,
> >>
> >> I'm having the same problem. I discovered a suggestion on
> another mailing list (stat.ethz.ch/mailman/listinfo/r-help)
> that using a delta argument would work ie
> pr1ML<-profile(fm1ML,delta=0.2). However, I have tried that
> and several smaller steps, but to no avail.
> >>
> >> I too am using lme4a and R-2.10.1 on windows 2007 professional.
> >>
> >> Any help from anyone would be greatly appreciated.
> >>
> >> Paul Taylor
> >> Pelagic Fisheries Scientist
> >> NIWA
> >> New Zealand.
> >>
> >> >>> pierre nouvellet 04/15/10 4:29 AM >>>
> >>
> >> Hi,Hi,
> >>
> >> >From the draft book on lme4:
> >> using lme4a, and the dyestuff data, I call profile and get:
> >>
> >> > profile(fm1ML)
> >>
> >> Error in UseMethod("profile") :
> >>
> >> no applicable method for 'profile' applied to an object of class
> >> "lmer"
> >>
> >>
> >> any suggestions?
> >> using a R-2.10.1 on windows vista
> >>
> >> pierre
> >>
> >>
> >> _________________________________________________________________
> >> Hotmail: Free, trusted and rich email service.
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> _______________________________________________
> >> R-sig-mixed-models at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >> NIWA is the trading name of the National Institute of
> Water & Atmospheric Research Ltd.
> >
> > _________________________________________________________________
> > Hotmail: Powerful Free email with security by Microsoft.
> >
> > ? ? ? ?[[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
>
>
> ------------------------------
>
> Message: 2
> Date: Thu, 15 Apr 2010 16:43:57 -0400
> From: "Kevin E. Thorpe" <kevin.thorpe at utoronto.ca>
> To: "r-sig-mixed-models at r-project.org"
> <r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] Another case of -1.0 correlation of random
> effects
> Message-ID: <4BC77A8D.5010208 at utoronto.ca>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Kevin E. Thorpe wrote:
> > Ben Bolker wrote:
> >> Ken Knoblauch wrote:
> >>> Kevin E. Thorpe <kevin.thorpe at ...> writes:
> >>>
> >>>> My data come from a crossover trial and are balanced.
> >>>>
> >>>> > str(gluc)
> >>>> 'data.frame': 96 obs. of 4 variables:
> >>>> $ Subject : int 1 2 3 5 6 7 10 11 12 13 ...
> >>>> $ Treatment: Factor w/ 2 levels "Barley","Oat": 1 1 1
> 1 1 1 1 1 1
> >>>> 1 ...
> >>>> $ Dose : int 8 8 8 8 8 8 8 8 8 8 ...
> >>>> $ iAUC : num 110 256 129 207 244 ...
> >>>>
> >>>> clip>
> >>> Shouldn't you make Subject into a factor?
> >>>
> >>> Ken
> >>>
> >>
> >> It would make the plot a little bit prettier but I don't think it
> >> matters in this case because variable that appears as a grouping
> >> variable (i.e. on the right of the | ) is automatically
> treated as a
> >> factor? I think?
> >>
> >> Since it is really a crossover trial, it would seem reasonable in
> >> principle to have the (Treatment|Subject) random effect in there as
> >> well. I'm not sure what to do about the -1 correlation: it
> seems the
> >> choices (not necessarily in order) are (1) throw up your
> hands and say
> >> there's not enough data to estimate independently; (2) try WinBUGS,
> >> possibly with slightly informative priors; (3) try using
> lme4a to create
> >> profiles of the parameters and see if you can figure out what's
> >> happening.
> >
> > Let's see. I wish (1) was an option. (2) would be promising if my
> > knowledge of BUGS and Bayesian methods filled more than a thimble.
> > Thanks to Jarrod for his suggestion in response to this.
> I'll take a
> > look at that too. Option (3) is probably worth a go too.
> >
> > Aside from the fact that the Dose variable are the actual
> doses and not
> > categories, and we all know not to categorize continuous
> variables, what
> > are your thoughts on treating Dose as a factor (since it
> seems to behave)?
> >
> > Thanks all for taking the time to provide your suggestions.
> >
> > Kevin
> >
> Okay, I now have lme4a installed and I get an error message when I do
> (note: this is the same model from my OP):
>
> > gluc.lmer1a <-
> lmer(iAUC~Dose+(Dose|Subject),data=gluc,subset=Treatment=="Oat
> ",REML=FALSE)
>
> > gluc.lmer1a
> Linear mixed model fit by maximum likelihood ['lmer']
> Formula: iAUC ~ Dose + (Dose | Subject)
> Data: gluc
> Subset: Treatment == "Oat"
> AIC BIC logLik deviance
> 575.1 586.3 -281.6 563.1
>
> Random effects:
> Groups Name Variance Std.Dev. Corr
> Subject (Intercept) 7492.19 86.557
> Dose 14.68 3.831 -1.000
> Residual 4727.27 68.755
> Number of obs: 48, groups: Subject, 12
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 309.352 29.338 10.544
> Dose -14.424 3.533 -4.083
>
> Correlation of Fixed Effects:
> (Intr)
> Dose -0.647
>
> > pr1 <- profile(gluc.lmer1a at env) ## using @env base on
> other threads
> Error in x[ndat + (1L:deg) - deg] :
> only 0's may be mixed with negative subscripts
>
> Is this because I'm trying to profile a model that profile() cannot
> handle yet, or does it indicate there really are serious
> problems with
> my model?
>
> I'm at a loss as to how determine what is really going on
> with these data.
>
> Kevin
> --
> Kevin E. Thorpe
> Biostatistician/Trialist, Knowledge Translation Program
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.thorpe at utoronto.ca Tel: 416.864.5776 Fax: 416.864.3016
>
>
>
> ------------------------------
>
> Message: 3
> Date: Fri, 16 Apr 2010 11:45:58 +0200
> From: Gustaf Granath <Gustaf.Granath at ebc.uu.se>
> To: Jarrod Hadfield <j.hadfield at ed.ac.uk>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] MCMCglmm: meta-analysis problem
> Message-ID: <4BC831D6.6000700 at ebc.uu.se>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Excellent! Works and the results seem to be reasonable.
>
> However, how would I add a hierarchical level on this?
>
> If we have the model
> y=bX + tau2 + V (V=the known co-variance matrix), your code give me b
> and tau2. But there might be another level of dependence in the data,
> for example, research groups (so studies are nested in
> research groups,
> call it batches). This would lead co-variances within each research
> group (i.e. batch). A common co-variance is ok (at least to
> start with).
> So we get,
> y=bX + D + V, (D=tau2*I + hierarchical group blocks with the
> co-variances)
>
> I have problems to add this. I played around but nothing
> really worked
> out. I tried the rcov argument:
> testdata$Bacth<-Batch=c("a","a","a","b","b","b","c","c")
>
> Ideas?
>
> Cheers,
>
> Gustaf
>
>
>
> On 2010-04-15 18:02, Jarrod Hadfield wrote:
> > Hi Gustaf,
> >
> > It's not pretty, but I believe it works:
> >
> > Rsvd<-svd(Rmat)
> > Rsvd<-Rsvd$v%*%(t(Rsvd$u)*sqrt(Rsvd$d))
> > Z<-model.matrix(~Experiment-1, testdata)%*%Rsvd
> >
> > testdata$Z<-Z
> >
> > prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, fix=1)))
> > m1<-MCMCglmm(y~1, random=~idv(Z), data=testdata, prior=prior)
> >
> >
> > It works because Z%*%t(Z) = Rmat, and idv fits a common
> variance to
> > all terms in Z. Since we have fixed this variance to one then the
> > random term is essentially fitting Rmat as a known
> covariance structure.
> >
> > MCMCglmm always fits random effect meta-analysis, because
> a) that is
> > the way I wrote it and b) I think it is an odd assumption
> to believe
> > that with enough replication every experiment should
> produce exactly
> > the same answer. Hence this model is fitting i.i.d residuals after
> > accounting for (correlated) measurement error.
> >
> > Cheers,
> >
> > Jarrod
> >
> >
> >
> >
> >
> >
> > On 15 Apr 2010, at 12:50, Gustaf Granath wrote:
> >
> >> Hi,
> >>
> >> In a meta-analysis, the SEs of the outcomes are known. However in
> >> some cases, the correlation (co-variances) among outcomes
> are known
> >> as well. For example when you have multiple outcomes from
> one study.
> >> I wanted to see if the whole error structure of measurement errors
> >> (the R-structure in MCMCglmm, or?) can be passed and not only the
> >> diagonal.
> >>
> >> ##test data
> >>
> testdata<-data.frame(Experiment=as.factor(c(1,2,3,4,5,6,7,8)),
Study=c("a","a","b","c","d","d","e","e")
> >>
> >>
> ,y=c(34,38,45,48,35,28,43,39),yvar=c(3,5,6,2,3,4,5,7),covar=c(
> 1.5,1.5,NA,NA,2.5,2.5,1.25,1.25))
> >>
> >> Rmat<-diag(8)*testdata$yvar
> >> Rmat[1,2]<-testdata$covar[1]
> >> Rmat[2,1]<-testdata$covar[2]
> >> Rmat[5,6]<-testdata$covar[5]
> >> Rmat[6,5]<-testdata$covar[6]
> >> Rmat[7,8]<-testdata$covar[7]
> >> Rmat[8,7]<-testdata$covar[8]
> >>
> >> Rmat is the known covariance structure of the experimental
> outcomes.
> >>
> >> library(MCMCglmm)
> >> #"normal" meta-analysis using only the diagonal:
> >> prior = list(R = list(V = 1, n=1,fix = 1),G = list(G1 =
> list ( V =
> >> 1, n = 1)))
> >> model1 <- MCMCglmm(y ~ 1, random = ~idh(sqrt(yvar)):units ,data =
> >> testdata,
> >> prior=prior)
> >>
> >> #OR (should be equal right? give sligtly different results though):
> >> model2 <- MCMCglmm(y ~ 1, random = ~Experiment ,data = testdata,
> >> mev=testdata$yvar,prior=prior)
> >>
> >> #Now I want to include the known within-study correlation.
> >> prior = list(R = list(V = Rmat, fix = 1),G = list(G1 = list ( V =
> >> (1), nu = 0.002)))
> >> model1 <- MCMCglmm(y ~ 1, random = ~idh(Experiment):units, rcov = ~
> >> us(Study):Experiment ,data = testdata,
> >> prior=prior)
> >>
> >> This did not work (I tried other ways as well but all
> failed) and I
> >> guess that it is because of my R-structure prior. Is there an
> >> alternative specification, or? (I played around a little with the
> >> "animal" argument but I couldnt get to do what I wanted.)
> >>
> >> Cheers,
> >>
> >> Gustaf
> >>
> >> --
> >>
> >> Gustaf Granath (PhD student)
> >>
> >> Dept of Plant Ecology
> >> Uppsala University
> >>
> >>
> >>
> >
> >
>
> --
> Gustaf Granath (PhD student)
>
> Dept of Plant Ecology
> Evolutionary Biology Centre (EBC)
> Uppsala University
> Norbyv?gen 18D, SE - 752 36 Uppsala, Sweden
>
> Tel: +46 (0)18-471 28 82
> Email: Gustaf.Granath at ebc.uu.se
> http://www.vaxtbio.uu.se/resfold/granath.htm
>
>
>
> ------------------------------
>
> _______________________________________________
> R-sig-mixed-models mailing list
> R-sig-mixed-models at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>
> End of R-sig-mixed-models Digest, Vol 40, Issue 33
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
Notice: This e-mail message, together with any attachme...{{dropped:10}}
More information about the R-sig-mixed-models
mailing list