[R-sig-ME] Trouble Replicating Unstructured Mixed Procedure in R
John Maindonald
john.maindonald at anu.edu.au
Thu Jan 26 23:37:21 CET 2012
It is not really a matter of computational accuracy. One can get highly
accurate values for an inappropriate statistic.
Or, if there is insistence on using the word 'accuracy', what is the
meaning?
i) the wrong formula is used? Then in what sense is it 'wrong'?
ii) there is a numerical inaccuracy in the calculation? This is almost
never an issue in a relatively simple calculation such as this, given
the care taken by the code writers in such matters.
iii) where an approximation is used, as in using an F-distribution
approximation, is the best choice of degrees of freedom made
for use of this approximation? I judge that the degrees of freedom
for lme's F-statistic for the interaction are not well chosen. Users
really have to sort this out for themselves, rather than relying on
what may be a fairly wild approximation that appears in lme's
output. Using 75df rather than 25df does not, however, make the
difference that a choice between (e.g.) 5df and 25df would.
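
To put rough numbers on the degrees-of-freedom point, here is a minimal R sketch. The numerator df of 3 is an assumption (an interaction between a four-level age factor and a two-level Sex factor), and the F value of 3 is purely illustrative; only the denominator df of 5, 25 and 75 come from the text above.

```r
# Assumption: 3 numerator df (four-level age factor by two-level Sex).
num_df <- 3
den_df <- c(5, 25, 75)

# Upper 5% critical values of the F distribution for each denominator df.
crit <- qf(0.95, df1 = num_df, df2 = den_df)
names(crit) <- paste0("den_df=", den_df)
print(round(crit, 3))

# p-values that one and the same F statistic would receive
# (F = 3 is an illustrative value, not taken from any output).
p_vals <- pf(3, df1 = num_df, df2 = den_df, lower.tail = FALSE)
names(p_vals) <- names(crit)
print(round(p_vals, 4))
```

The gap between the 5df and 25df results should be visibly larger than the gap between 25df and 75df.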
A further and more basic issue is whether the statistic that is
provided is appropriate to the intended generalisation. I'd take
this to be generalisation to another sample of youths from the
same population. In order to understand why R and SAS are
giving different F-statistics for the interaction, one needs to
understand just what variance-covariance structure is assumed
in each case. One might extract the two estimates of the
var-cov structure and compare them. Look for terms in one that
do not appear, or maybe that are zero, in the other.
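
One way such a comparison might be set up on the R side is sketched below. The two model formulas are assumptions, not the models from the original posts: a random-intercept lme fit, and a gls fit with corSymm plus varIdent as a rough stand-in for an unstructured covariance matrix. The column name agef and the choice of subject "M01" are likewise only illustrative.

```r
library(nlme)  # provides lme(), gls(), getVarCov() and the Orthodont data

# Plain data frame with age as a factor (agef is a hypothetical name).
d <- as.data.frame(Orthodont)
d$agef <- factor(d$age)

# Assumed model 1: random intercept per Subject, which implies a
# compound-symmetric marginal covariance matrix.
fit_lme <- lme(distance ~ agef * Sex, random = ~ 1 | Subject, data = d)

# Assumed model 2: unrestricted within-subject covariance matrix
# (general correlation plus a separate variance for each age).
fit_un <- gls(distance ~ agef * Sex, data = d,
              correlation = corSymm(form = ~ 1 | Subject),
              weights = varIdent(form = ~ 1 | agef))

# Estimated 4 x 4 covariance matrix of one subject's measurements.
V_cs <- getVarCov(fit_lme, individuals = "M01", type = "marginal")[[1]]
V_un <- getVarCov(fit_un, individual = "M01")
print(round(unclass(V_cs), 2))
print(round(unclass(V_un), 2))
```

Placing the two matrices side by side shows which variances and covariances are forced to be equal in one fit and left free in the other.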
Finally, it is not just that Venables does not like type III SS.
He is saying that they almost never correspond to a null
hypothesis that makes any sense. Those who disagree should try to
write down the model to which the null hypothesis corresponds
when testing for the main effect of factor1 in the presence of a
factor1:factor2 interaction.
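
The dependence on parameterisation can be seen in a small sketch. Everything here is an assumption made for illustration: an ordinary lm fit rather than a mixed model, the warpbreaks data from base R rather than Orthodont, and drop1() with scope . ~ . as a stand-in for a "type III" style test of a main effect with the interaction retained.

```r
# warpbreaks ships with base R: wool has 2 levels, tension has 3.
# Same model under two codings of the factors.
fit_trt <- lm(breaks ~ wool * tension, data = warpbreaks,
              contrasts = list(wool = "contr.treatment",
                               tension = "contr.treatment"))
fit_sum <- lm(breaks ~ wool * tension, data = warpbreaks,
              contrasts = list(wool = "contr.sum", tension = "contr.sum"))

# The two fits give identical fitted values; only the coding differs.
stopifnot(isTRUE(all.equal(fitted(fit_trt), fitted(fit_sum))))

# Drop each term in turn while keeping all others, interaction included.
t3_trt <- drop1(fit_trt, scope = . ~ ., test = "F")
t3_sum <- drop1(fit_sum, scope = . ~ ., test = "F")

# The "main effect" test for wool answers a different question in each coding.
print(t3_trt["wool", c("Df", "F value")])
print(t3_sum["wool", c("Df", "F value")])
```

Under treatment contrasts the wool row tests the wool difference at the reference tension level; under sum contrasts it tests the wool difference averaged over tension levels. The interaction row is the same in both.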
John Maindonald email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473 fax : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm
On 27/01/2012, at 2:41 AM, Thompson,Paul wrote:
> OK, I've looked at that reference.
>
> There are 2 aspects of an estimate like a SS. The first is the stability of the estimate, and the second is the interpretation of the estimate. The issues with the interpretation of the different estimates go back to 1970, and they are simply a matter of interpretation. The point of the Venables discussion is that he does not like Type III SS, not that they are wrong. He does not agree with the interpretation.
>
> The issue here is the accuracy of the Type III or Type I or Type II or whatever. Accuracy comes before interpretation. If the R module and SAS do not arrive at the same estimates, that is an important thing.
>
> Once we agree upon computation, we can argue about interpretation. Charles Determan is inquiring as to computational accuracy. The use and interpretation of the various Type I, II, III, IV, LVX SS are secondary.
>
> -----Original Message-----
> From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Luca Borger
> Sent: Thursday, January 26, 2012 9:03 AM
> To: r-sig-mixed-models at r-project.org
> Subject: Re: [R-sig-ME] Trouble Replicating Unstructured Mixed Procedure in R
>
> I think:
>
> http://www.stats.ox.ac.uk/pub/MASS3/Exegeses.pdf
>
> HTH
> Luca
>
>
>
> On 26/01/2012 15:52, Thompson,Paul wrote:
>> I am unfamiliar with this critique of Type III SS. Can you point me to a reference discussing the difficulties with Type III SS?
>>
>> -----Original Message-----
>> From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of John Maindonald
>> Sent: Wednesday, January 25, 2012 11:19 PM
>> To: David Duffy
>> Cc: r-sig-mixed-models at r-project.org
>> Subject: Re: [R-sig-ME] Trouble Replicating Unstructured Mixed Procedure in R
>>
>> It is well to note that type III sums of squares are problematic.
>> For testing the effects of a main effect, the null model is constraining
>> the main effect in a manner that depends on the parameterisation.
>>
>> There are situations where it makes sense to fit interactions without
>> main effects, and it is clear what constraint on the main effect is the
>> relevant null (with an interaction between a factor and a variable,
>> does one want all lines to go through the same point, or through
>> perhaps the origin?), but that situation is unusual. For lines that
>> are separate or all through the one point, one does not need
>> type III sums of squares.
>>
>> Analyses often have enough genuine complications without
>> worrying (unless it is blindingly obvious that one ought to worry
>> about it) about the rarely relevant complication of attending to a
>> type III sum of squares.
>>
>> I'd guess that SAS and lme are, effectively, making different
>> assumptions about the intended generalisation. They are
>> clearly using different denominator degrees of freedom for F.
>> As one is looking for consistency across the 27 different youths,
>> SAS's denominator degrees of freedom for the interaction seem
>> more or less right, pretty much equivalent to calculating slopes
>> for females and slopes for males and using a t-test to compare
>> them. (Sure, in the analyses presented, age has been treated
>> as a categorical variable, but the comment still applies.)
>>
>> On 26/01/2012, at 1:54 PM, David Duffy wrote:
>>
>>> On Tue, 24 Jan 2012, Charles Determan Jr wrote:
>>>
>>>> Greetings,
>>>>
>>>> I have been working on R for some time now and I have begun the endeavor of
>>>> trying to replicate some SAS code in R. I have scoured the forums but
>>>>
>>> This is also the Orthodont dataset, distributed with nlme.
>>>
>>> As David Atkins pointed out, R defaults to Type I SS, so you would need to use, for example, the Anova() command from the car package. The other thing is that the SAS F statistics are only approximate, depending on which covariance structure is chosen (perhaps John Maindonald or someone clever could comment), so SAS offers different possibilities for ddf, e.g.
>>>
>>> http://www2.sas.com/proceedings/sugi26/p262-26.pdf
>>>
>>> while lme and lmer offer one or none.
>>>
>>> --
>>> | David Duffy (MBBS PhD) ,-_|\
>>> | email: davidD at qimr.edu.au ph: INT+61+7+3362-0217 fax: -0101 / *
>>> | Epidemiology Unit, Queensland Institute of Medical Research \_,-._/
>>> | 300 Herston Rd, Brisbane, Queensland 4029, Australia GPG 4D0B994A v
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models