[R] Regression Testing
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Fri Jan 21 15:13:27 CET 2011
On Fri, 21 Jan 2011, Mojo wrote:
> On 1/20/2011 4:42 PM, Achim Zeileis wrote:
>> On Thu, 20 Jan 2011, Mojo wrote:
>>
>>> I'm new to R and somewhat new to the world of stats. I got frustrated
>>> with Excel and found R. Enough of that already.
>>>
>>> I'm trying to test and correct for heteroskedasticity.
>>>
>>> I have data in a csv file that I load and store in a dataframe.
>>>
>>>> ds <- read.csv("book2.csv")
>>>> df <- data.frame(ds)
>>>
>>> I then perform an OLS regression:
>>>
>>>> lmfit <- lm(df$y~df$x)
>>
>> Just btw: lm(y ~ x, data = df) is somewhat easier to read and also easier
>> to write when the formula involves more regressors.
>>
>>> To test for heteroskedasticity, I run bptest:
>>>
>>>> bptest(lmfit)
>>>
>>> studentized Breusch-Pagan test
>>>
>>> data: lmfit
>>> BP = 11.6768, df = 1, p-value = 0.0006329
>>>
>>> From the above, if I'm interpreting this correctly, there is
>>> heteroskedasticity present. To correct for this, I need to calculate
>>> robust standard errors.
>>
>> That is one option. Another one would be using WLS instead of OLS - or
>> maybe FGLS. As the model just has one regressor, this might be possible and
>> result in a more efficient estimate than OLS.
>
> I thought that WLS (which I'm guessing is a weighted regression) is really only
> useful when you know or at least have an idea of what is causing the
> heteroskedasticity?
Yes. But with only a single variable that shouldn't be too hard to do.
Also in the Breusch-Pagan test you specify a hypothesized functional form
for the variance.
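
A minimal sketch of the WLS route, under loud assumptions: the data below
are simulated (they are not the book2.csv data from this thread), and the
variance form Var(e) proportional to x^2 is purely hypothetical. The
weights are the inverse of the hypothesized variance function:

```r
# Simulated stand-in data; the variance structure is an assumption
# made for illustration only.
set.seed(1)
n  <- 200
x  <- runif(n, 1, 10)
y  <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)   # error sd grows with x
df <- data.frame(x = x, y = y)

# OLS fit for comparison
ols <- lm(y ~ x, data = df)

# WLS: weights inversely proportional to the assumed variance, Var(e) ~ x^2
wls <- lm(y ~ x, data = df, weights = 1 / x^2)

# Coefficient estimates and standard errors of both fits
summary(ols)$coefficients
summary(wls)$coefficients
```

With lmtest loaded, the same hypothesized form can be passed to the test
itself, e.g. bptest(ols, varformula = ~ I(x^2), data = df).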
> I'm not familiar with FGLS.
There is a worked example in
demo("Ch-LinearRegression", package = "AER")
The corresponding book has some more details.
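
For orientation, a rough sketch of the FGLS idea in base R. This is not a
copy of the demo: the data are simulated, and the variance model
Var(e) = sigma^2 * x^gamma is an assumption chosen for illustration. The
unknown gamma is estimated from an auxiliary regression on the OLS
residuals, and the fitted variances then serve as inverse weights:

```r
# Simulated stand-in data (assumption; not the data from this thread)
set.seed(1)
n  <- 200
x  <- runif(n, 1, 10)
y  <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)
df <- data.frame(x = x, y = y)

# Step 1: plain OLS
ols <- lm(y ~ x, data = df)

# Step 2: auxiliary regression of log squared residuals on log(x);
# under the assumed model the slope estimates gamma
aux <- lm(log(residuals(ols)^2) ~ log(x), data = df)

# Step 3: WLS with weights equal to the inverse fitted variances
fgls <- lm(y ~ x, data = df, weights = 1 / exp(fitted(aux)))

coef(aux)    # intercept and estimated gamma
coef(fgls)   # FGLS coefficient estimates
```

Steps 2 and 3 can be repeated with the residuals of the new fit until the
coefficients stop changing.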
hth,
Z
> I plan on adding additional
> independent variables as I get more comfortable with everything.
>
>>
>>> From my reading on this list, it seems like I need to use vcovHC.
>>
>> That's another option, yes.
>>
>>>> vcovHC(lmfit)
>>> (Intercept) df$x
>>> (Intercept) 1.057460e-03 -4.961118e-05
>>> df$x -4.961118e-05 2.378465e-06
>>>
>>> I'm having a little bit of a hard time following the help pages.
>>
>> Yes, the manual page is somewhat technical but the first thing the
>> "Details" section does is: It points you to some references that should be
>> easier to read. I recommend starting with
>>
>> Zeileis A (2004), Econometric Computing with HC and HAC Covariance
>> Matrix Estimators. _Journal of Statistical Software_, *11*(10),
>> 1-17. URL <URL: http://www.jstatsoft.org/v11/i10/>.
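
A short sketch of how the matrix returned by vcovHC() is typically put to
use, assuming the sandwich and lmtest packages are installed. The data are
simulated for illustration and are not the data from this thread:

```r
library("sandwich")   # vcovHC()
library("lmtest")     # coeftest()

# Simulated stand-in data (assumption)
set.seed(1)
n  <- 200
x  <- runif(n, 1, 10)
y  <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)
df <- data.frame(x = x, y = y)

ols <- lm(y ~ x, data = df)

# Heteroskedasticity-consistent covariance matrix of the coefficients
V <- vcovHC(ols)

# Robust standard errors are the square roots of its diagonal
robust_se <- sqrt(diag(V))
robust_se

# Coefficient table with t tests based on the robust covariance matrix
coeftest(ols, vcov. = V)
```

The coefficient estimates are the same as in summary(ols); only the
standard errors, t statistics and p-values change.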
>
> I will look into that.
>
> Thanks,
> Mojo
>
>