[R] Regression Testing

Achim Zeileis Achim.Zeileis at uibk.ac.at
Fri Jan 21 15:13:27 CET 2011


On Fri, 21 Jan 2011, Mojo wrote:

> On 1/20/2011 4:42 PM, Achim Zeileis wrote:
>> On Thu, 20 Jan 2011, Mojo wrote:
>> 
>>> I'm new to R and somewhat new to the world of stats.  I got frustrated 
>>> with Excel and found R.  Enough of that already.
>>> 
>>> I'm trying to test and correct for heteroskedasticity.
>>> 
>>> I have data in a CSV file that I load and store in a data frame.
>>> 
>>>> ds <- read.csv("book2.csv")
>>>> df <- data.frame(ds)
>>> 
>>> I then perform an OLS regression:
>>> 
>>>> lmfit <- lm(df$y~df$x)
>> 
>> Just btw: lm(y ~ x, data = df) is somewhat easier to read and also easier 
>> to write when the formula involves more regressors.
>> 
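To illustrate the point about the data argument, here is a minimal sketch on
simulated data (the data frame below is a stand-in, not the poster's
book2.csv). Note that read.csv() already returns a data frame, so the extra
data.frame() call is not needed.

```r
# Simulated stand-in for the poster's data (an assumption for illustration)
set.seed(1)
df <- data.frame(x = runif(50))
df$y <- 1 + 2 * df$x + rnorm(50)

fit1 <- lm(df$y ~ df$x)        # slope coefficient is labelled "df$x"
fit2 <- lm(y ~ x, data = df)   # slope coefficient is labelled "x"

names(coef(fit1))
names(coef(fit2))

# With the data = form, predictions for new x values work as expected
predict(fit2, newdata = data.frame(x = c(0.25, 0.75)))
```
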
>>> To test for heteroskedasticity, I run the Breusch-Pagan test (bptest):
>>> 
>>>> bptest(lmfit)
>>>
>>>        studentized Breusch-Pagan test
>>> 
>>> data:  lmfit
>>> BP = 11.6768, df = 1, p-value = 0.0006329
>>> 
>>> From the above, if I'm interpreting this correctly, there is 
>>> heteroskedasticity present.  To correct for this, I need to calculate 
>>> robust standard errors.
>> 
>> That is one option. Another one would be using WLS instead of OLS - or 
>> maybe FGLS. As the model just has one regressor, this might be possible and 
>> result in a more efficient estimate than OLS.
>
> I thought that WLS (which I'm guessing is weighted regression) is really only 
> useful when you know or at least have an idea of what is causing the 
> heteroskedasticity?

Yes. But with only a single regressor that shouldn't be too hard to do. 
Also, in the Breusch-Pagan test you already specify a hypothesized 
functional form for the variance.
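
A minimal sketch of both steps on simulated data. The data-generating
process and the weights 1/x^2 are assumptions chosen for illustration;
nothing here is known about the actual book2.csv.

```r
library(lmtest)  # provides bptest()

# Simulated data whose error standard deviation grows with x (assumption)
set.seed(1)
df <- data.frame(x = runif(200, 1, 10))
df$y <- 1 + 2 * df$x + rnorm(200, sd = 0.5 * df$x)

ols <- lm(y ~ x, data = df)

# Breusch-Pagan test: varformula names the variables assumed to drive the
# variance (if omitted, the regressors of the main model are used)
bp <- bptest(ols, varformula = ~ x, data = df)
bp

# WLS under the assumed form Var(e_i) proportional to x_i^2,
# so each observation is weighted by 1 / x_i^2
wls <- lm(y ~ x, data = df, weights = 1 / x^2)
summary(wls)
```

If the assumed variance form is wrong, the WLS coefficient estimates stay
consistent, but the usual WLS standard errors are no longer reliable.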

> I'm not familiar with FGLS.

There is a worked example in

   demo("Ch-LinearRegression", package = "AER")

The corresponding book has some more details.
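
For orientation, the two-step idea behind FGLS can be sketched as follows.
This is not the code from the demo: the simulated data and the log-linear
variance function are assumptions made only for illustration.

```r
# Simulated data with error variance depending on x (assumption)
set.seed(1)
df <- data.frame(x = runif(200, 1, 10))
df$y <- 1 + 2 * df$x + rnorm(200, sd = 0.5 * df$x)

# Step 0: ordinary least squares to obtain residuals
ols <- lm(y ~ x, data = df)

# Step 1: estimate the variance function, here assuming
# Var(e_i) = exp(g0 + g1 * log(x_i)), via an auxiliary regression
aux <- lm(log(residuals(ols)^2) ~ log(x), data = df)

# Step 2: refit with weights equal to the inverse of the fitted variances
fgls <- lm(y ~ x, data = df, weights = 1 / exp(fitted(aux)))

# Compare coefficient estimates from OLS and FGLS
cbind(OLS = coef(ols), FGLS = coef(fgls))
```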

hth,
Z

> I plan on adding additional 
> independent variables as I get more comfortable with everything.
>
>> 
>>> From my reading on this list, it seems like I need to use vcovHC.
>> 
>> That's another option, yes.
>> 
>>>> vcovHC(lmfit)
>>>              (Intercept)         df$x
>>> (Intercept)  1.057460e-03 -4.961118e-05
>>> df$x       -4.961118e-05  2.378465e-06
>>> 
>>> I'm having a little bit of a hard time following the help pages.
>> 
>> Yes, the manual page is somewhat technical but the first thing the 
>> "Details" section does is: It points you to some references that should be 
>> easier to read. I recommend starting with
>>
>>      Zeileis A (2004), Econometric Computing with HC and HAC Covariance
>>      Matrix Estimators. Journal of Statistical Software, 11(10),
>>      1-17. URL http://www.jstatsoft.org/v11/i10/.
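
In practice the matrix returned by vcovHC() is rarely read directly; it is
passed on to an inference function. A minimal sketch, again on simulated
data (an assumption, since book2.csv is not available):

```r
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

# Simulated heteroskedastic data (assumption for illustration)
set.seed(1)
df <- data.frame(x = runif(200, 1, 10))
df$y <- 1 + 2 * df$x + rnorm(200, sd = 0.5 * df$x)

lmfit <- lm(y ~ x, data = df)

# Robust standard errors are the square roots of the diagonal of the
# heteroskedasticity-consistent covariance matrix (default type: "HC3")
robust_se <- sqrt(diag(vcovHC(lmfit)))
robust_se

# Coefficient table (estimates, robust standard errors, t tests)
ct <- coeftest(lmfit, vcov. = vcovHC)
ct
```

The coefficient estimates are the same as in summary(lmfit); only the
standard errors, t statistics, and p-values change.
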
>
> I will look into that.
>
> Thanks,
> Mojo
>
>


