[R] statistics - hypothesis testing question

Leeds, Mark (IED) Mark.Leeds at morganstanley.com
Thu Sep 13 20:39:27 CEST 2007


You're right, Duncan. My bad. That was kind of dopey because R squared is
a statistic in itself. The models aren't nested because the two
predictors are different and there are no other predictors. I'm trying
to see whether the Model B predictor is "better" than the Model A
predictor. I guess how one defines "better" is the real question, so I
apologize for that. Still, any comments or suggestions are welcome.
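
One thing I've wondered about is whether something like the
Davidson-MacKinnon J test for non-nested models (jtest in the lmtest
package) applies here. A minimal sketch, assuming dat holds one week of
data with the LHS y and the two competing predictors x1 and x2:

library(lmtest)

# fit the two competing single-predictor models
modelA <- lm(y ~ x1, data = dat)
modelB <- lm(y ~ x2, data = dat)

# J test: does adding the fitted values of each model improve the other?
jtest(modelA, modelB)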


-----Original Message-----
From: Duncan Murdoch [mailto:murdoch at stats.uwo.ca] 
Sent: Thursday, September 13, 2007 2:32 PM
To: Leeds, Mark (IED)
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] statistics - hypothesis testing question

On 9/13/2007 2:18 PM, Leeds, Mark (IED) wrote:
> I estimate two competing simple regression models, A and B, where the 
> LHS is the same in both cases but the predictor is different (I 
> handle the intercept issue based on other postings I have seen). I 
> estimate the two models on a weekly basis over 24 weeks.
> So, I end up with 24 RsquaredAs and 24 RsquaredBs, so essentially 2 
> time series of Rsquareds. This doesn't necessarily have to be thought 
> of as a time series problem, but is there a usual way, given the 
> Rsquared data, to test
> 
> H0 : Rsquared B = Rsquared A versus H1 : Rsquared B > Rsquared A
> 
> so that I can map the 24 R squared numbers into 1 statistic. Maybe 
> that's somehow equivalent to just running 2 big regressions over the 
> whole 24 weeks and then calculating a statistic based on those two 
> regressions?

The question doesn't make sense if you're using standard notation.  R^2
is a statistic, not a parameter, so one wouldn't test copies of it for
equality.

You can probably reframe the question in terms of E(R^2) so the
statement parses, but then it doesn't really make sense from a subject
matter point of view:  unless model A is nested within model B, why
would you ever expect the two fits to explain exactly the same amount of
variation?
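
(Mechanically, the reframed test would be something like the rough
sketch below, where rsqA and rsqB are assumed to hold the 24 weekly R^2
values; whether it answers a meaningful question is another matter.)

# rough sketch: paired comparison of the 24 weekly R^2 values
d <- rsqB - rsqA
t.test(d, alternative = "greater")       # H1: E(R^2_B) > E(R^2_A)
wilcox.test(d, alternative = "greater")  # nonparametric version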

If model A is really a special case of model B, then you're back to the
standard hypothesis testing situation, but repeated 24 times.  There's a
lot of literature on how to handle such multiple testing problems,
depending on what sort of alternatives you want to detect.  (E.g. do you
think all 24 cases will be identical, or is it possible that 23 will
match but one doesn't?)
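
For the nested case, one common recipe is a partial F test per week
followed by a multiplicity adjustment. A rough sketch, assuming weeks is
a list of 24 data frames, each with columns y, x1 and x2:

# one partial F test per week, then adjust the 24 p-values
pvals <- sapply(weeks, function(w) {
  small <- lm(y ~ x1, data = w)        # the special case (model A)
  big   <- lm(y ~ x1 + x2, data = w)   # the nesting model (model B)
  anova(small, big)[2, "Pr(>F)"]       # p-value for the added term
})
p.adjust(pvals, method = "holm")       # Holm correction; others exist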

Duncan Murdoch

> 
> I broke things up into 24 weeks because I was thinking that the 
> stability of the performance difference of the two models could be 
> examined over time. Essentially these are simple time series 
> regressions X_t = B*X_{t-1} + epsilon, so I always need to consider 
> whether any type of behavior is stable.  But now I am thinking that, 
> if I just want one overall number, then maybe I should be considering 
> all the data simultaneously?
> 
> In a nutshell,  I am looking for any suggestions on the best way to 
> test whether Model B is better than Model A where
> 
> Model A :  X_t = Beta*X_{t-1} + epsilon
> 
> Model B :  X_t = Betastar*Xstar_{t-1} + epsilonstar
> 
> 
> Thanks for your help.