[R] Insurance data in library(MASS)

Greg Snow Greg.Snow at imail.org
Mon Feb 23 18:41:26 CET 2009


In the Insurance dataset both Age and Group are ordered factors, so their default encoding is orthogonal polynomial contrasts (assuming that the user has not changed the default contrasts option).  In the output below, .L marks the coefficient for the "Linear" piece of the encoding, i.e. the linear contrast across the ordered levels, .Q is the "Quadratic" piece/contrast, and .C is the "Cubic" one.  If linear/quadratic/cubic contrasts are unfamiliar, some background reading on orthogonal polynomials will help.
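To see this encoding directly, here is a small sketch (it assumes only that the MASS package is installed and that the contrasts option is still at its default).  It checks that Group and Age are ordered, shows which contrast function R uses for ordered factors, and prints the polynomial contrast matrix for a 4-level factor, whose columns carry the .L, .Q and .C labels seen in the summary.  The name cp is just an illustrative variable.

```r
library(MASS)   # provides the Insurance data set

# Both Group and Age are stored as ordered factors
is.ordered(Insurance$Group)
is.ordered(Insurance$Age)

# Current contrast settings; the "ordered" entry applies to ordered factors
getOption("contrasts")

# Orthogonal polynomial contrasts for a 4-level ordered factor;
# the three columns are labelled .L, .Q and .C
cp <- contr.poly(4)
cp

# The columns are orthonormal, so their cross-product is
# (up to rounding) the 3 x 3 identity matrix
round(crossprod(cp), 10)
```

Each of Group and Age has four levels, which is why each contributes exactly three coefficients (linear, quadratic, cubic) to the model.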

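If you read the data in yourself from a .csv file, then Age and Group will not be ordered factors unless you specifically convert them.  The default encoding will then be something other than orthogonal polynomials (treatment contrasts, unless the default has been changed), so the individual coefficients and their names will be different.  The overall effect will be the same, though: as long as the same variables are treated as factors, the fitted values and the deviance do not change, only the way the factor effects are parametrized.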
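The following sketch illustrates the point under some assumptions: the CSV file is simulated by writing the packaged data to a temporary file (the original poster's attachment is not available), and the names csv_file, ins, m2 and m3 are made up for the example.  Note that District comes back from the file as a plain number, so it is converted to a factor here; without that step the model itself would differ, not just its parametrization.

```r
library(MASS)

# Reference fit on the packaged data (ordered factors, polynomial contrasts)
m1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = Insurance, family = poisson)

# Simulate the CSV round trip; the temporary file is purely illustrative
csv_file <- tempfile(fileext = ".csv")
write.csv(Insurance, csv_file, row.names = FALSE)
ins <- read.csv(csv_file)

# District is read back as a number, so turn it into a factor again
ins$District <- factor(ins$District)

# Group and Age are no longer ordered, so the default for unordered
# factors (treatment contrasts) is used instead of polynomials
m2 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = ins, family = poisson)
names(coef(m2))                          # no .L/.Q/.C terms here
all.equal(deviance(m1), deviance(m2))    # compares the overall fit

# Restore the ordering, reusing the level order of the packaged data
ins$Group <- factor(ins$Group, levels = levels(Insurance$Group), ordered = TRUE)
ins$Age   <- factor(ins$Age,   levels = levels(Insurance$Age),   ordered = TRUE)

m3 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = ins, family = poisson)
all.equal(coef(m1), coef(m3))            # compares coefficients with m1
```

Passing the levels explicitly matters because, left to itself, R sorts the level labels alphabetically, which is not the natural order of the engine-size and age bands.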

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-
> project.org] On Behalf Of choonhong ang
> Sent: Monday, February 23, 2009 10:05 AM
> To: r-help at r-project.org
> Subject: [R] Insurance data in library(MASS)
> 
> I have used the insurance data from R library and I have 2 questions:
> I use the following:
> >library(MASS)
> >data(Insurance)
> > m1=glm(Claims ~ District + Group + Age + offset(log(Holders)),data =
> Insurance, family = poisson)
> >summary(m1)
> 
> Call:
> glm(formula = Claims ~ District + Group + Age + offset(log(Holders)),
>     family = poisson, data = Insurance)
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -2.46558  -0.50802  -0.03198   0.55555   1.94026
> Coefficients:
>              Estimate Std. Error z value Pr(>|z|)
> (Intercept) -1.810508   0.032972 -54.910  < 2e-16 ***
> District2    0.025868   0.043016   0.601 0.547597
> District3    0.038524   0.050512   0.763 0.445657
> District4    0.234205   0.061673   3.798 0.000146 ***
> Group.L      0.429708   0.049459   8.688  < 2e-16 ***
> Group.Q      0.004632   0.041988   0.110 0.912150
> Group.C     -0.029294   0.033069  -0.886 0.375696
> Age.L       -0.394432   0.049404  -7.984 1.42e-15 ***
> Age.Q       -0.000355   0.048918  -0.007 0.994210
> Age.C       -0.016737   0.048478  -0.345 0.729910
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> (Dispersion parameter for poisson family taken to be 1)
>     Null deviance: 236.26  on 63  degrees of freedom
> Residual deviance:  51.42  on 54  degrees of freedom
> AIC: 388.74
>  (1) In the result above, what are Group.L, Group.Q, Group.C, Age.L,
> Age.Q, and Age.C?
> 
>  (2) When I copy the Insurance data into csv format (as shown in the
> attachment) and run the same procedure, the result shown is different
> from the above result. Why?


