[R] R question - combine values

markleeds at verizon.net
Thu Feb 26 07:14:22 CET 2009


Oops, then I guess I should not have sent the recode suggestion.
Choonhong: I only sent it as an example of how to recode your factor.
I didn't mean to imply (nor did I even give it much thought) that what
you're doing is statistically or philosophically correct.
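
For the record, here is a minimal sketch of one way to merge two levels
of a factor in base R. It is an illustration only and not necessarily
the recode suggestion mentioned above: the toy data frame, the new
column District2 and the merged label "ab" are assumptions; only the
column name District and its levels a to d come from the original post.

```r
# Toy data standing in for the poster's 'mydata' (all values are invented)
mydata <- data.frame(
  District = factor(c("a", "b", "c", "d", "a", "b")),
  Claims   = c(3, 5, 2, 7, 4, 6)
)

# Work on a copy so the original factor stays available
mydata$District2 <- mydata$District

# Giving two levels the same label merges them into a single level
levels(mydata$District2)[levels(mydata$District2) %in% c("a", "b")] <- "ab"

# Cross-tabulate old against new levels to check the mapping
table(mydata$District, mydata$District2)
```

Refitting with District2 in place of District would then estimate a
single coefficient for the merged level. Whether doing so is
statistically defensible is a separate question.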

I'm a friend, but I think what David is implying is that you are
deciding on hypotheses after looking at the results, which is kind of
cheating. It means you can't rely on any statistical tests you do going
forward, because they will be biased.
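
To make that concern concrete, here is a small simulation sketch under
illustrative assumptions that are not taken from the thread: normal
data instead of the Poisson model, four groups with no true
differences, and 2000 replications. In each replication it picks the
two most different groups after looking at the data and then tests them
as if that comparison had been planned in advance. The rejection rate
should come out noticeably above the nominal 5%.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n_sim  <- 2000                   # number of simulated data sets (assumed)
reject <- logical(n_sim)

for (i in seq_len(n_sim)) {
  g <- gl(4, 25)                 # four groups of 25 observations each
  y <- rnorm(100)                # the null is true: all group means are equal

  # Look at the data first, then choose which two groups to compare
  means <- tapply(y, g, mean)
  lo <- names(which.min(means))
  hi <- names(which.max(means))

  # Test the data-chosen pair as if it had been specified beforehand
  reject[i] <- t.test(y[g == lo], y[g == hi])$p.value < 0.05
}

# Share of false rejections at a nominal 5% level
mean(reject)
```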

I'm sure he has a good point, but I don't want to get into this since I
really don't know what you're doing and it's a very complex topic. I
think Frank's book has a lot to say about this type of thing.

On Thu, Feb 26, 2009 at 12:14 AM, David Winsemius wrote:

> You don't.
>
> And even if you do get someone to tell you how, you may still not 
> legitimately lower your degrees of freedom. Friends don't let friends 
> use stepwise approaches to regression analysis.
>
> -- 
> David Winsemius
>
> On Feb 25, 2009, at 10:33 PM, choonhong ang wrote:
>
>> District a is the baseline, and we observe that the difference between
>> districts a and b is not significant, so we can choose to combine these
>> two values. How do I write code to combine these two values?
>>
>>> m1=glm(Claims~District+Group+Age+log(Holders),family=poisson,data=mydata)
>>> summary(m1)
>>
>> Call:
>> glm(formula = Claims ~ District + Group + Age + log(Holders),
>> family = poisson, data = mydata)
>>
>> Deviance Residuals:
>>       Min         1Q     Median         3Q        Max
>> -2.553115  -0.471819   0.002411   0.455274   1.800739
>>
>> Coefficients:
>>               Estimate Std. Error z value Pr(>|z|)
>> (Intercept)  -2.777752   0.689162  -4.031 5.56e-05 ***
>> Districtb     0.119942   0.079861   1.502 0.133125
>> Districtc     0.228371   0.144503   1.580 0.114019
>> Districtd     0.571661   0.248792   2.298 0.021576 *
>> Group>2l      0.794721   0.180354   4.406 1.05e-05 ***
>> Group1-1.5l  -0.003496   0.127947  -0.027 0.978202
>> Group1.5-2l   0.379190   0.055856   6.789 1.13e-11 ***
>> Age>35       -1.074971   0.389480  -2.760 0.005780 **
>> Age25-29     -0.332131   0.129512  -2.564 0.010333 *
>> Age30-35     -0.539815   0.160138  -3.371 0.000749 ***
>> log(Holders)  1.201696   0.144135   8.337  < 2e-16 ***
>> ---
>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>
>> (Dispersion parameter for poisson family taken to be 1)
>>
>> Null deviance: 4236.68 on 63 degrees of freedom
>> Residual deviance: 49.45 on 53 degrees of freedom
>> AIC: 388.77
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.