[R] glm expand model to more values
Jarek Jasiewicz
jarekj at amu.edu.pl
Sat Jan 12 21:48:22 CET 2008
Charles Annis, P.E. wrote:
> Jarek:
>
> Although it is not universally agreed on, I believe the first step in any
> data analysis is to PLOT YOUR DATA.
>
> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
> plot(b ~ a, data=dd)
> simple.model <- lm(b~a,data=dd)
> abline(simple.model)
>
> Why do you think you need a cubic model to describe 6 observations?
>
> Your model is overparameterized - it has two more parameters than the number
> of observations can reasonably justify, something that would be obvious from
> your plot.
>
> The summary of simple.model shows both the intercept and the
> slope are statistically meaningful. (That's what the asterisks mean.)
>
> Call:
> lm(formula = b ~ a, data = dd)
>
> Residuals:
>        1        2        3        4        5        6
> -0.23810  0.39048  0.01905 -0.35238  0.27619 -0.09524
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  1.86667    0.30132   6.195  0.00345 **
> a            1.37143    0.07737  17.725 5.95e-05 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.3237 on 4 degrees of freedom
> Multiple R-Squared: 0.9874, Adjusted R-squared: 0.9843
> F-statistic: 314.2 on 1 and 4 DF, p-value: 5.952e-05
>
> I think you should invest a small amount of your time, and an even smaller
> amount of your money to purchase and read - cover-to-cover - one of the
> several very good books on elementary statistics and R. My recommendation
> is _Introductory Statistics with R_ by Peter Dalgaard (Paperback - Jan 9,
> 2004). Amazon.com carries it.
>
> Best wishes.
>
>
>
> Charles Annis, P.E.
>
> Charles.Annis at StatisticalEngineering.com
> phone: 561-352-9699
> eFax: 614-455-3265
> http://www.StatisticalEngineering.com
>
>
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
> Behalf Of Jarek Jasiewicz
> Sent: Saturday, January 12, 2008 2:06 PM
> To: Charles.Annis at statisticalengineering.com
> Cc: R-help at r-project.org
> Subject: Re: [R] glm expand model to more values
>
> Charles Annis, P.E. wrote:
>
>> How many parameters are you trying to estimate? How many observations do
>> you have?
>>
>> What is wrong is that half of your parameter estimates are statistically
>> meaningless:
>>
>> dd <- data.frame(a=c(1, 2, 3, 4, 5, 6), b=c(3, 5, 6, 7, 9, 10))
>>
>> overparameterized.model <- glm(b~poly(a,3),data=dd)
>>
>> summary(overparameterized.model)
>>
>>
>> Coefficients:
>>             Estimate Std. Error t value Pr(>|t|)
>> (Intercept)   6.6667     0.1725  38.644 0.000669 ***
>> poly(a, 3)1   5.7371     0.4226  13.576 0.005382 **
>> poly(a, 3)2  -0.1091     0.4226  -0.258 0.820395
>> poly(a, 3)3   0.2236     0.4226   0.529 0.649562
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
>> Behalf Of Jarek Jasiewicz
>> Sent: Saturday, January 12, 2008 11:50 AM
>> To: R-help at r-project.org
>> Subject: [R] glm expand model to more values
>>
>> Hi
>>
>> I have a problem fitting a curve to data with lm and glm. When I use a
>> polynomial dependency, the fitted values from the model are OK, but I
>> cannot get the proper values when I use the coefficients to calculate them.
>> Let me present a simple example:
>>
>> I have a simple data.frame (dd):
>> a: 1 2 3 4 5 6
>> b: 3 5 6 7 9 10
>>
>> I try to fit it to a model:
>>
>> model=glm(b~poly(a,3),data=dd)
>> I get the following fitted values from the model (as I expected):
>> > fitted(model)
>>         1         2         3         4         5         6
>>  3.095238  4.738095  6.095238  7.333333  8.619048 10.119048
>>
>> and coef(model):
>> (Intercept) poly(a, 3)1 poly(a, 3)2 poly(a, 3)3
>>   6.6666667   5.7370973  -0.1091089   0.2236068
>>
>> So when I try to extend the model to other data (simple extrapolation),
>> let's say: s=seq(1,10,by=1)
>>
>> I do:
>> extra=sapply(s,function(x) coef(model) %*% x^(0:3))
>> and here is the result:
>> [1] 12.51826 19.49328 28.93336 42.18015 60.57528 85.46040 118.17714
>> [8] 160.06715 212.47207 276.73354
>>
>> The values obtained by expanding the coefs are completely different from
>> the fitted ones.
>>
>> What's going wrong?
>>
>> Jarek
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> Sorry, but I cannot understand. What does it mean that the data are
> statistically meaningless?
>
> It is an example with very simple data, which I use following the simpleR
> manual example, to check why I cannot get the expected result. I need a
> simple model y~x^3+x^2....+z to extrapolate data.
> Jarek
>
I understand that the data are not a good example, but I am trying to find a
rather general solution.
The original data are a list of 98 data frames, calculated by an R script of
over 100 lines. I thought that was too much to attach, so I typed a few
digits to illustrate the problem.
The question was asked wrongly. It should be:
are the formulas
pol3_model=lm(b~poly(a,3))
p3_model=lm(b~a+I(a^2)+I(a^3))
the same? According to the R documentation, yes:
both give the same fitted() values, but completely different coef().
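
A minimal sketch of the comparison, assuming the six-point dd from earlier in
this thread. By default poly(a,3) builds orthogonal polynomials rather than the
raw powers 1, a, a^2, a^3, so its coefficients cannot be multiplied by x^(0:3).
That would explain the mismatch, since the two formulas describe the same fit in
different bases. The names raw_model, by_predict and by_coef are only
illustrative, not part of the original script.

```r
# Data as posted earlier in the thread
dd <- data.frame(a = 1:6, b = c(3, 5, 6, 7, 9, 10))

# Default poly(): orthogonal polynomial basis
pol3_model <- lm(b ~ poly(a, 3), data = dd)
# Explicit raw powers of a
p3_model <- lm(b ~ a + I(a^2) + I(a^3), data = dd)
# poly() with raw = TRUE also uses raw powers (illustrative third variant)
raw_model <- lm(b ~ poly(a, 3, raw = TRUE), data = dd)

# Same fit, different parameterisation
print(all.equal(fitted(pol3_model), fitted(p3_model)))
print(coef(pol3_model))  # coefficients of the orthogonal basis
print(coef(p3_model))    # coefficients of 1, a, a^2, a^3
print(coef(raw_model))   # should agree with p3_model, names aside

# Extrapolation: let predict() rebuild the basis for the new values ...
s <- 1:10
by_predict <- predict(pol3_model, newdata = data.frame(a = s))
# ... or expand by hand, but only with the raw-power coefficients
by_coef <- sapply(s, function(x) sum(coef(p3_model) * x^(0:3)))
print(all.equal(unname(by_predict), by_coef))
```

With this approach predict() works for either formula, so the hand expansion
is only needed when the raw coefficients themselves are wanted.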