[R] How shall one present LRT test statistic in a scientific journal ?

Thu Nov 26 19:25:54 CET 2009

On Nov 26, 2009, at 12:46 PM, Peter Dalgaard wrote:

> David Winsemius wrote:
>>
>> On Nov 26, 2009, at 12:14 PM, JVezilier wrote:
>>
>>>
>>> Hello !!
>>>
>>> I've recently been having a debate with my PhD supervisor regarding how
>>> to write the result of a likelihood ratio test in an article I'm about
>>> to submit.
>>>
>>> I analysed my data using "lme" mixed modelling.
>>>
>>> To get some p-values for my fixed effect I used model simplification,
>>> and the typical output R gives looks like this:
>>>
>>> model2 = update ( model1,~.-factor A)
>>> anova (model1, model2)
>>>
>>>        Model df       AIC      BIC   logLik   Test  L.Ratio p-value
>>> model1     1 26 -78.73898 15.29707 65.36949
>>> model2     2 20 -73.70539 -1.36997 56.85270 1 vs 2 17.03359  0.0092
>>>
>>> I thought about presenting it very simply, copying and pasting the R
>>> table and writing it like: "factor A had a significant effect on the
>>> response variable (Likelihood ratio test, L-ratio = 17.033, p = 0.0092)"
>>>
>>> But my boss argued that it's too unusual (at least in our field of
>>> evolutionary biology) and that I should instead present the LR statistic
>>> together with the corresponding Chi^2 statistic, since the likelihood
>>> ratio statistic is approximately distributed as Chi^2 with df1-df2
>>> degrees of freedom, and then write down the p-value corresponding to
>>> this value of Chi^2.
>>>
>>> I looked in the current literature but cannot really find a proper
>>> answer to that dilemma.
>>>
>>> So, dear evolutionary biologists R users, how would you present it ?
>>
>> I am not an evolutionary biologist, but presumably your supervisor is
>> one. Why are you picking a fight not only with him but with your
>> prospective audience when there is no meaningful difference? Here is the
>> p-value you would get with his method:
>>
>>>> 1-pchisq( 2*(65.36949 -  56.85270), df=6)
>> [1] 0.009160622
>>
>
> As I understood the question, it *is* purely formalistic. I.e., what to
> write, not what to do.
>
> I'd say "L-ratio" is plain wrong, since this is not a ratio, but the log
> of a ratio. "-2lnQ" or "-2logQ" is what my old teachers would write, but
> pragmatically, I'd expect the best chances with editors and reviewers to
> be "LRT: chi-square=17.03, df=6, p=0.0092", possibly with LRT spelled
> out. (Some journals like to have the df because it allows reviewers to
> catch glaring mistakes like categorical variables treated as numeric.)

I wonder about the phrase "used model simplification". Wouldn't that
raise a question about the proper degrees of freedom to use? If terms
were dropped from the model simply on the basis of "non-significance",
shouldn't there be some appropriate penalization of subsequent tests of
significance?

-- 
David.

>
> -- 
>   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
> (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT