[R] Off topic - Differences among stat packages in GLM results
Thomas Lumley
tlumley at u.washington.edu
Mon Oct 6 16:02:32 CEST 2003
On Mon, 6 Oct 2003, Yves Claveau wrote:
> Dear colleagues,
> I have performed the same analysis using the GLM
> module of three statistical packages: SYSTAT 10, JMP
> 4.0.2 and R 1.6.2 (see below for more details).
> Although SYSTAT and R give roughly the same level of
> significance for all variables, JMP yields a 20 percent
> difference in probability for a categorical variable.
> In fact, this difference is so important that I can
> call this variable significant. Incidentally, Tukey's
> test is in accordance with this result. Which
> statistical software should I believe?
It looks as though you have asked for three different analyses from the
three packages. Certainly the analysis you asked R for is not the same as
the others.
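
One way to see why the summary() table need not match an "effect test" from another package: in a model with interactions, the coefficient test for a main effect depends on how the factors are coded, even though the fitted model is the same. The sketch below uses simulated data (the variable names mirror the post, but the values, the seed and the sample size are made up) and lm(), which fits the same model as glm() with family = gaussian.

```r
# Hypothetical data: names follow the post, values are simulated.
set.seed(1)
n <- 48
d <- data.frame(
  esp    = factor(rep(c("a", "b"), each = n / 2)),
  classl = factor(rep(c("x", "y"), times = n / 2)),
  ht     = runif(n, 20, 200)
)
d$ptro <- 35 + 0.08 * d$ht + 5 * (d$esp == "b") + rnorm(n, sd = 8)

# Same model formula, two codings of the two-level factors.
fit_trt <- lm(ptro ~ esp * ht * classl, data = d,
              contrasts = list(esp = "contr.treatment",
                               classl = "contr.treatment"))
fit_sum <- lm(ptro ~ esp * ht * classl, data = d,
              contrasts = list(esp = "contr.sum", classl = "contr.sum"))

# The two fits are the same model: identical residual sum of squares.
same_fit <- isTRUE(all.equal(deviance(fit_trt), deviance(fit_sum)))

# The coefficient test labelled "esp" asks a different question in each coding.
p_trt <- coef(summary(fit_trt))["espb", "Pr(>|t|)"]
p_sum <- coef(summary(fit_sum))["esp1", "Pr(>|t|)"]

# The highest-order interaction test does not depend on the coding.
p3_trt <- coef(summary(fit_trt))["espb:ht:classly", "Pr(>|t|)"]
p3_sum <- coef(summary(fit_sum))["esp1:ht:classl1", "Pr(>|t|)"]

print(c(esp_treatment = p_trt, esp_sum = p_sum))
print(c(three_way_treatment = p3_trt, three_way_sum = p3_sum))
```

With treatment coding the "esp" coefficient compares the two esp groups at ht = 0 in the reference level of classl; with sum-to-zero coding it compares them at ht = 0 averaged over classl. Which coding each of the other packages uses is not stated in the post, so this only illustrates the mechanism.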
If you run the anova() function on your model in R you should get one of
the other two analyses. I think Systat gives the things SAS calls Type II
sums of squares, in which case JMP is presumably giving real sums of
squares and will agree with anova().
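
A minimal sketch of what anova() reports, again on simulated data (names follow the post; the values, group sizes and seed are assumptions for illustration). anova() on a fitted lm gives sequential sums of squares, so each term is adjusted only for the terms before it, and the line for a main effect can change when the order of terms in the formula changes.

```r
# Hypothetical, deliberately unbalanced data; values are simulated.
set.seed(2)
n <- 49
d <- data.frame(
  esp    = factor(rep(c("a", "b"), times = c(33, 16))),
  classl = factor(rep(c("x", "y"), length.out = n)),
  ht     = runif(n, 20, 200)
)
d$ptro <- 35 + 0.08 * d$ht + 5 * (d$esp == "b") + rnorm(n, sd = 8)

# Same model, terms entered in a different order.
fit1 <- lm(ptro ~ esp * ht * classl, data = d)   # esp enters first
fit2 <- lm(ptro ~ classl * ht * esp, data = d)   # esp enters after classl and ht

a1 <- anova(fit1)
a2 <- anova(fit2)

# The esp line differs: unadjusted in a1, adjusted for classl and ht in a2.
print(a1["esp", ])
print(a2["esp", ])

# The last term and the residual line are the same in both tables.
print(a1["esp:ht:classl", ])
print(a2["classl:ht:esp", ])
```

Whether either table reproduces the SYSTAT or JMP output for the real data is something to check by running anova() on the actual model, not something this sketch can show.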
=thomas
> Thank you in advance for your insight.
>
> Yves Claveau
>
>
>
> DETAILS ON PERFORMED STATISTICAL ANALYSES
>
> The categorical variable I am writing about is ESP
>
> The model used is:
>
> ptro=CONSTANT+classl+ht+esp+classl*ht+classl*esp+ht*esp+classl*ht*esp
>
> Where:
> - ptro is the dependent variable
> - CONSTANT the constant in the model (default procedure)
> - classl a categorical variable with two classes
> - ht a continuous variable
> - esp a categorical variable with two classes
>
>
> The results for each package are:
>
> R 1.6.2
>
> Call:
> glm(formula = PTRO ~ ESP. * HT * CLASSL., family = gaussian,
>     data = dataa)
>
> Deviance Residuals:
>       Min        1Q    Median        3Q       Max
> -20.21973  -4.41060  -0.03971   4.77046  14.29097
>
> Coefficients:
>                Estimate Std. Error t value Pr(>|t|)
> (Intercept)    35.54604    4.65265   7.640 3.41e-09 ***
> ESP           -13.12051   12.32455  -1.065    0.294
> HT              0.08005    0.04374   1.830    0.075 .
> CLASSL          1.09480    5.54809   0.197    0.845
> ESP:HT          0.01694    0.12375   0.137    0.892
> ESP:CLASSL      5.89693   15.41378   0.383    0.704
> HT:CLASSL      -0.01952    0.04682  -0.417    0.679
> ESP:HT:CLASSL  -0.05547    0.13217  -0.420    0.677
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for gaussian family taken to be 59.17901)
>
> Null deviance: 4567.3 on 45 degrees of freedom
> Residual deviance: 2248.8 on 38 degrees of freedom
> AIC: 327.46
>
> Number of Fisher Scoring iterations: 2
>
>
> SYSTAT 10
>
> Dep Var: PTRO   N: 49   Multiple R: 0.7241   Squared multiple R: 0.5244
>
> Analysis of Variance
> Source         Sum-of-Squares  df  Mean-Square  F-ratio       P
>
> ESP                  113.6878   1     113.6878   1.6551  0.2055
> CLASSL                20.6118   1      20.6118   0.3001  0.5868
> HT                   239.7713   1     239.7713   3.4908  0.0689
> CLASSL*HT             26.3909   1      26.3909   0.3842  0.5388
> CLASSL*ESP             5.9755   1       5.9755   0.0870  0.7695
> ESP*HT                 2.6415   1       2.6415   0.0385  0.8455
> CLASSL*ESP*HT         12.9459   1      12.9459   0.1885  0.6665
>
> Error               2816.1893  41      68.6875
>
>
> JMP 4
>
> RSquare 0.52438
> RSquare Adj 0.443177
> Root Mean Square Error 8.287795
> Mean of Response 42.78898
> Observations (or Sum Wgts) 49
>
> Analysis of Variance
> Source    DF  Sum of Squares  Mean Square  F Ratio  Prob > F
> Model      7       3104.9018      443.557   6.4576    <.0001
> Error     41       2816.1893       68.688
> C. Total  48       5921.0910
>
> Effect Tests
> Source         Nparm  DF  Sum of Squares  F Ratio  Prob > F
>
> ESP                1   1       636.09249   9.2607    0.0041
> CLASSL             1   1         8.26185   0.1203    0.7305
> HT                 1   1       239.77125   3.4908    0.0689
> HT*CLASSL          1   1        26.39087   0.3842    0.5388
> ESP*CLASSL         1   1        12.18491   0.1774    0.6758
> ESP*HT             1   1         2.64154   0.0385    0.8455
> ESP*HT*CLASSL      1   1        12.94593   0.1885    0.6665
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle