[R-sig-ME] Negative R-squared in linear regression: What's themeaning

Tue Apr 20 15:02:58 CEST 2010

I don't know if you got any reply off-list, but I didn't see any coming
through the mailing list, so...

The negative value is for the "adjusted" R-squared, and there's no
guarantee that that has to be non-negative.  In your case you only have
one degree of freedom for error, thus maximizing the chance for that
statistic to have a negative value.  R-squared is 1 - SSE/SStotal, and
indeed is equivalent to squared correlation between response and fitted
values, but adjusted R-squared is 1 - MSE/MStotal, and that need not be
positive, and does not correspond to squared correlation.

If you are fitting a model with 1 df for error, you need to learn the
danger of doing usual inference with such a model.

Andy 

> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org 
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf 
> Of Djibril Dayamba
> Sent: Friday, April 16, 2010 8:00 AM
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] Negative R-squared in linear regression: 
> What's themeaning
> 
> Dear all,
> I am trying to run a regression with two variables (Rainfaill 
> and Rainydays) as predictor of species Abundance (Abund). I 
> got a negative adjusted R-squared (-1.055) which is very 
> strange for me; indeed as far as I understand this statistics 
> could never be negative as it is supposed to be the squared 
> value of R.
> I would really appreciate if somebody could tell me why is it 
> so and what does this mean in practice (please see the output 
> from R below)?
> 
> Regards,
> 
> Sidzabda Djibril Dayamba, 
> Swedish University of Agricultural Sciences
> Faculty of Forest Science
> Southern Swedish Forest Research Centre
> Tropical Silviculture and Seed Laboratory
> PO Box 101
> SE - 230 53 Alnarp,
> Sweden
> Tel: +46 76 83 515 70 (Mobile)
>         +46 40 41 53 95 (Office)
> 
> 
> > mod1<-lm(Abund~Rain+Rdays)
> > summary(mod1)
> 
> Call:
> lm(formula = Abund ~ Rain + Rdays)
> 
> Residuals:
>       1       2       3       4 
>  0.8104 -3.4108  0.8480  1.7524 
> 
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 88.444166  27.319474   3.237    0.191
> Rain        -0.005913   0.012995  -0.455    0.728
> Rdays       -0.112197   0.605448  -0.185    0.883
> 
> Residual standard error: 4.01 on 1 degrees of freedom
> Multiple R-squared: 0.315,      Adjusted R-squared: -1.055 
> F-statistic:  0.23 on 2 and 1 DF,  p-value: 0.8276
> 
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org 
> [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf 
> Of r-sig-mixed-models-request at r-project.org
> Sent: den 16 april 2010 12:00
> To: r-sig-mixed-models at r-project.org
> Subject: R-sig-mixed-models Digest, Vol 40, Issue 33
> 
> Send R-sig-mixed-models mailing list submissions to
> 	r-sig-mixed-models at r-project.org
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> 	https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> or, via email, send a message with subject or body 'help' to
> 	r-sig-mixed-models-request at r-project.org
> 
> You can reach the person managing the list at
> 	r-sig-mixed-models-owner at r-project.org
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of R-sig-mixed-models digest..."
> 
> 
> Today's Topics:
> 
>    1. Re: profile() for lmer (Douglas Bates)
>    2. Re: Another case of -1.0 correlation of random effects
>       (Kevin E. Thorpe)
>    3. Re: MCMCglmm: meta-analysis problem (Gustaf Granath)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Thu, 15 Apr 2010 13:04:53 -0500
> From: Douglas Bates <bates at stat.wisc.edu>
> To: pierre nouvellet <pierrenouvellet at hotmail.com>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] profile() for lmer
> Message-ID:
> 	<l2q40e66e0b1004151104tcf945fd5oce48b0c3dafa11ba at mail.gmail.com>
> Content-Type: text/plain; charset=ISO-8859-1
> 
> On Thu, Apr 15, 2010 at 5:40 AM, pierre nouvellet
> <pierrenouvellet at hotmail.com> wrote:
> 
> > Thanks again to Rob!
> 
> > the call to 'profile(fm1ML at env)' ?worked perfectly!
> > it is really nice to see it working smoothly!!
> 
> Thanks to those who picked up on this.  Both Martin and I have been
> doing a lot of development on lme4a in the past couple of weeks,
> including fundamental changes to the internal representation.  We
> didn't mean to turn the process of determining the profiles into a
> treasure hunt - that just sort-of happened.
> 
> I should post a warning that the profile is likely to break again in
> the near future when I move in the wonderful new and improved class
> representation currently lurking around as a resurrected lmer2
> function.  We will get the profiling working again after we break it
> but there is a lot going on under the hood right now and some breakage
> is inevitable.
> 
> As others have discovered, lme4a now depends on the Rcpp package and,
> inadvertently, on an unreleased version.  That should be fixed soon.
> 
> Thanks for your patience.
> 
> 
> > pierre
> >
> >> Date: Thu, 15 Apr 2010 22:07:30 +1200
> >> From: p.taylor at niwa.co.nz
> >> To: pierrenouvellet at hotmail.com; r-sig-mixed-models at r-project.org
> >> Subject: Re: [R-sig-ME] profile() for lmer
> >>
> >> Hi Pierre,
> >>
> >> I'm having the same problem. I discovered a suggestion on 
> another mailing list (stat.ethz.ch/mailman/listinfo/r-help) 
> that using a delta argument would work ie 
> pr1ML<-profile(fm1ML,delta=0.2). However, I have tried that 
> and several smaller steps, but to no avail.
> >>
> >> I too am using lme4a and R-2.10.1 on windows 2007 professional.
> >>
> >> Any help from anyone would be greatly appreciated.
> >>
> >> Paul Taylor
> >> Pelagic Fisheries Scientist
> >> NIWA
> >> New Zealand.
> >>
> >> >>> pierre nouvellet 04/15/10 4:29 AM >>>
> >>
> >> Hi,Hi,
> >>
> >> >From the draft book on lme4:
> >> using lme4a, and the dyestuff data, I call profile and get:
> >>
> >> > profile(fm1ML)
> >>
> >> Error in UseMethod("profile") :
> >>
> >> no applicable method for 'profile' applied to an object of class
> >> "lmer"
> >>
> >>
> >> any suggestions?
> >> using a R-2.10.1 on windows vista
> >>
> >> pierre
> >>
> >>
> >> _________________________________________________________________
> >> Hotmail: Free, trusted and rich email service.
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> _______________________________________________
> >> R-sig-mixed-models at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >> NIWA is the trading name of the National Institute of 
> Water & Atmospheric Research Ltd.
> >
> > _________________________________________________________________
> > Hotmail: Powerful Free email with security by Microsoft.
> >
> > ? ? ? ?[[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Thu, 15 Apr 2010 16:43:57 -0400
> From: "Kevin E. Thorpe" <kevin.thorpe at utoronto.ca>
> To: "r-sig-mixed-models at r-project.org"
> 	<r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] Another case of -1.0 correlation of random
> 	effects
> Message-ID: <4BC77A8D.5010208 at utoronto.ca>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Kevin E. Thorpe wrote:
> > Ben Bolker wrote:
> >> Ken Knoblauch wrote:
> >>> Kevin E. Thorpe <kevin.thorpe at ...> writes:
> >>>
> >>>> My data come from a crossover trial and are balanced.
> >>>>
> >>>>  > str(gluc)
> >>>> 'data.frame':    96 obs. of  4 variables:
> >>>>   $ Subject  : int  1 2 3 5 6 7 10 11 12 13 ...
> >>>>   $ Treatment: Factor w/ 2 levels "Barley","Oat": 1 1 1 
> 1 1 1 1 1 1 
> >>>> 1 ...
> >>>>   $ Dose     : int  8 8 8 8 8 8 8 8 8 8 ...
> >>>>   $ iAUC     : num  110 256 129 207 244 ...
> >>>>
> >>>> clip>
> >>> Shouldn't you make Subject into a factor?
> >>>
> >>> Ken
> >>>
> >>
> >>   It would make the plot a little bit prettier but I don't think it
> >> matters in this case because variable that appears as a grouping
> >> variable (i.e. on the right of the | ) is automatically 
> treated as a
> >> factor?  I think?
> >>
> >>   Since it is really a crossover trial, it would seem reasonable in
> >> principle to have the (Treatment|Subject) random effect in there as
> >> well. I'm not sure what to do about the -1 correlation: it 
> seems the
> >> choices (not necessarily in order) are (1) throw up your 
> hands and say
> >> there's not enough data to estimate independently; (2) try WinBUGS,
> >> possibly with slightly informative priors; (3) try using 
> lme4a to create
> >> profiles of the parameters and see if you can figure out what's 
> >> happening.
> > 
> > Let's see.  I wish (1) was an option.  (2) would be promising if my 
> > knowledge of BUGS and Bayesian methods filled more than a thimble. 
> > Thanks to Jarrod for his suggestion in response to this.  
> I'll take a 
> > look at that too.  Option (3) is probably worth a go too.
> > 
> > Aside from the fact that the Dose variable are the actual 
> doses and not 
> > categories, and we all know not to categorize continuous 
> variables, what
> > are your thoughts on treating Dose as a factor (since it 
> seems to behave)?
> > 
> > Thanks all for taking the time to provide your suggestions.
> > 
> > Kevin
> > 
> Okay, I now have lme4a installed and I get an error message when I do 
> (note: this is the same model from my OP):
> 
>  > gluc.lmer1a <- 
> lmer(iAUC~Dose+(Dose|Subject),data=gluc,subset=Treatment=="Oat
> ",REML=FALSE)
> 
>  > gluc.lmer1a
> Linear mixed model fit by maximum likelihood  ['lmer']
> Formula: iAUC ~ Dose + (Dose | Subject)
>     Data: gluc
>   Subset: Treatment == "Oat"
>       AIC      BIC   logLik deviance
>     575.1    586.3   -281.6    563.1
> 
> Random effects:
>   Groups   Name        Variance Std.Dev. Corr
>   Subject  (Intercept) 7492.19  86.557
>            Dose          14.68   3.831   -1.000
>   Residual             4727.27  68.755
> Number of obs: 48, groups: Subject, 12
> 
> Fixed effects:
>              Estimate Std. Error t value
> (Intercept)  309.352     29.338  10.544
> Dose         -14.424      3.533  -4.083
> 
> Correlation of Fixed Effects:
>       (Intr)
> Dose -0.647
> 
>  > pr1 <- profile(gluc.lmer1a at env)  ## using @env base on 
> other threads
> Error in x[ndat + (1L:deg) - deg] :
>    only 0's may be mixed with negative subscripts
> 
> Is this because I'm trying to profile a model that profile() cannot 
> handle yet, or does it indicate there really are serious 
> problems with 
> my model?
> 
> I'm at a loss as to how determine what is really going on 
> with these data.
> 
> Kevin
> -- 
> Kevin E. Thorpe
> Biostatistician/Trialist, Knowledge Translation Program
> Assistant Professor, Dalla Lana School of Public Health
> University of Toronto
> email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.3016
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Fri, 16 Apr 2010 11:45:58 +0200
> From: Gustaf Granath <Gustaf.Granath at ebc.uu.se>
> To: Jarrod Hadfield <j.hadfield at ed.ac.uk>
> Cc: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] MCMCglmm: meta-analysis problem
> Message-ID: <4BC831D6.6000700 at ebc.uu.se>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> 
> Excellent! Works and the results seem to be reasonable.
> 
> However, how would I add a hierarchical level on this?
> 
> If we have the model
> y=bX + tau2 + V (V=the known co-variance matrix), your code give me b 
> and tau2. But there might be another level of dependence in the data, 
> for example, research groups (so studies are nested in 
> research groups, 
> call it batches). This would lead co-variances within each research 
> group (i.e. batch). A common co-variance is ok (at least to 
> start with).
> So we get,
> y=bX + D + V, (D=tau2*I + hierarchical group blocks with the 
> co-variances)
> 
> I have problems to add this. I played around but nothing 
> really worked 
> out. I tried the rcov argument:
> testdata$Bacth<-Batch=c("a","a","a","b","b","b","c","c")
> 
> Ideas?
> 
> Cheers,
> 
> Gustaf
> 
> 
> 
> On 2010-04-15 18:02, Jarrod Hadfield wrote:
> > Hi Gustaf,
> >
> > It's not pretty,  but I believe it works:
> >
> > Rsvd<-svd(Rmat)
> > Rsvd<-Rsvd$v%*%(t(Rsvd$u)*sqrt(Rsvd$d))
> > Z<-model.matrix(~Experiment-1, testdata)%*%Rsvd
> >
> > testdata$Z<-Z
> >
> > prior=list(R=list(V=1, nu=0), G=list(G1=list(V=1, fix=1)))
> > m1<-MCMCglmm(y~1, random=~idv(Z), data=testdata, prior=prior)
> >
> >
> > It works because Z%*%t(Z)  = Rmat, and idv fits a common 
> variance to 
> > all terms in Z. Since we have fixed this variance to one then the 
> > random term is essentially fitting Rmat as a known 
> covariance structure.
> >
> > MCMCglmm always fits random effect meta-analysis, because 
> a) that is 
> > the way I wrote it and b) I think it is an odd assumption 
> to believe 
> > that with enough replication every experiment should 
> produce exactly 
> > the same answer.  Hence this model is fitting i.i.d residuals after 
> > accounting for (correlated) measurement error.
> >
> > Cheers,
> >
> > Jarrod
> >
> >
> >
> >
> >
> >
> > On 15 Apr 2010, at 12:50, Gustaf Granath wrote:
> >
> >> Hi,
> >>
> >> In a meta-analysis, the SEs of the outcomes are known. However in 
> >> some cases, the correlation (co-variances) among outcomes 
> are known 
> >> as well. For example when you have multiple outcomes from 
> one study. 
> >> I wanted to see if the whole error structure of measurement errors 
> >> (the R-structure in MCMCglmm, or?) can be passed and not only the 
> >> diagonal.
> >>
> >> ##test data
> >> 
> testdata<-data.frame(Experiment=as.factor(c(1,2,3,4,5,6,7,8)),
Study=c("a","a","b","c","d","d","e","e") 
> >>
> >> 
> ,y=c(34,38,45,48,35,28,43,39),yvar=c(3,5,6,2,3,4,5,7),covar=c(
> 1.5,1.5,NA,NA,2.5,2.5,1.25,1.25)) 
> >>
> >> Rmat<-diag(8)*testdata$yvar
> >> Rmat[1,2]<-testdata$covar[1]
> >> Rmat[2,1]<-testdata$covar[2]
> >> Rmat[5,6]<-testdata$covar[5]
> >> Rmat[6,5]<-testdata$covar[6]
> >> Rmat[7,8]<-testdata$covar[7]
> >> Rmat[8,7]<-testdata$covar[8]
> >>
> >> Rmat is the known covariance structure of the experimental 
> outcomes.
> >>
> >> library(MCMCglmm)
> >> #"normal" meta-analysis using only the diagonal:
> >> prior = list(R = list(V = 1, n=1,fix = 1),G = list(G1  = 
> list ( V = 
> >> 1, n = 1)))
> >> model1 <- MCMCglmm(y ~ 1, random = ~idh(sqrt(yvar)):units ,data = 
> >> testdata,
> >>                     prior=prior)
> >>
> >> #OR (should be equal right? give sligtly different results though):
> >> model2 <- MCMCglmm(y ~ 1, random = ~Experiment ,data = testdata,
> >>                     mev=testdata$yvar,prior=prior)
> >>
> >> #Now I want to include the known within-study correlation.
> >> prior = list(R = list(V = Rmat, fix = 1),G = list(G1  = list ( V = 
> >> (1), nu = 0.002)))
> >> model1 <- MCMCglmm(y ~ 1, random = ~idh(Experiment):units, rcov = ~
> >> us(Study):Experiment ,data = testdata,
> >>                     prior=prior)
> >>
> >> This did not work (I tried other ways as well but all 
> failed) and I 
> >> guess that it is because of my R-structure prior. Is there an 
> >> alternative specification, or? (I played around a little with the 
> >> "animal" argument but I couldnt get to do what I wanted.)
> >>
> >> Cheers,
> >>
> >> Gustaf
> >>
> >> -- 
> >>
> >> Gustaf Granath (PhD student)
> >>
> >> Dept of Plant Ecology
> >> Uppsala University
> >>
> >>
> >>
> >
> >
> 
> -- 
> Gustaf Granath (PhD student)
> 
> Dept of Plant Ecology
> Evolutionary Biology Centre (EBC)
> Uppsala University
> Norbyv?gen 18D, SE - 752 36 Uppsala, Sweden
> 
> Tel: +46 (0)18-471 28 82
> Email: Gustaf.Granath at ebc.uu.se
> http://www.vaxtbio.uu.se/resfold/granath.htm
> 
> 
> 
> ------------------------------
> 
> _______________________________________________
> R-sig-mixed-models mailing list
> R-sig-mixed-models at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> End of R-sig-mixed-models Digest, Vol 40, Issue 33
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
Notice:  This e-mail message, together with any attachme...{{dropped:10}}