[RsR] Several bugs for the lmRob function

Mon Dec 14 16:27:18 CET 2009

Dear Olivier,

First of all, thank you very much for the patch. I'll take a look at including it as soon as I can. Also, there is a function aovRob in the funs* directory of the robust package. It hasn't been "ported" from S-Plus yet, but that really shouldn't be too difficult. I'll let you know when I've made some progress.

Cheers,
Kjell

*You will probably need to get the source code via svn from R-Forge.

On 8 Dec 2009, at 13:56, Olivier Renaud wrote:

> Dear all,
> Today I am sending two mails related to robust ANOVA in R; since their scope and target differ, it is better to keep the subjects separate. In the next mail, I make a plea for improving the accessibility of robust methods for ANOVA. As far as I can tell, robust ANOVA cannot be run with lmrob from the robustbase library (see the other mail). In this mail, I list several bugs in lmRob from the robust library.
> 
> 	• There is a bug in the function "lmRob": when a formula contains an interaction between factors, only the main effects are treated as factors, while the interaction is treated as a continuous variable. This is problematic, since the initial estimator is then very likely to give an error because of the discrete nature of the interaction. Example:
> > summary(lmRob (formula = Resp~Origine*Sexe, data = POVm3))
> Error in psi.weight(wi[wcnd], ipsi, xk) : 
>   NA/NaN/Inf in foreign function call (arg 2)
> In addition: Warning message:
> In lmRob.fit.compute(x2, y, x1 = x1, x1.idx = x1.idx, nrep = nrep,  :
>   Initial scale less than tl =  1e-06 .
> 
> Attached is a fix to the function lmRob; see the two lines marked "### added". I'm not a guru programmer, so you might want to check the code. With this corrected function, the above call works.
> 
> 	• This is not a bug, but there are more initial algorithms than initial.alg in lmRob.control lets the user choose. More importantly, depending on the type of covariates, the user's choice can be silently overridden (in the function lmRob.fit.compute). I do not criticize this, but I suggest that (a) a message be displayed to inform the user of the change and (b) the initial algorithm that is finally used be stored in the lmRob object. At the moment, the user has no way of knowing which initial algorithm was used.
> 
> 	• There is a bug in the function "anova.lmRob": when the covariates contain missing values, the models being compared are not fitted to the same number of observations, which makes the inference completely wrong. Again, I am not a specialist, but I think it comes from the line
>                curobj <- update(object, curfrm)    
> which uses the original data frame instead of only the observations with no missing values in the FULL model (which are given in the lm object obj$model). When smaller models are fitted, some observations that were not used in the full model are typically used. I'm not sure how to fix this bug.
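
The row mismatch described above can be reproduced in base R. The following is only a sketch of the principle, using lm() as a stand-in for lmRob() and the airquality data set that ships with R; whether update() on an lmRob object accepts a data argument in the same way is an assumption that would need checking against the robust package.

```r
# Stand-in example: lm() instead of lmRob(), built-in airquality data.
# Ozone and Solar.R both contain missing values.
full <- lm(Ozone ~ Wind + Solar.R, data = airquality, na.action = na.omit)

# Naive refit: update() goes back to the original data frame, so rows that
# were dropped only because Solar.R was missing are used again.
naive <- update(full, . ~ . - Solar.R)

# Refit restricted to the rows actually used by the full model.
used  <- model.frame(full)
fixed <- update(full, . ~ . - Solar.R, data = used)

# Compare the number of observations behind each fit.
c(full = nobs(full), naive = nobs(naive), fixed = nobs(fixed))
```

Here the naive reduced model is fitted to more rows than the full model, while the refit on model.frame(full) uses exactly the same rows, which is what a comparison of nested models needs.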
> 
> 	• Possibly related to this missing-value problem: I obtained an F value of -133, although the corresponding t-value in summary was close to zero and did not look suspicious.
> > anova(lmRob(ttf~income*sexe,data=cig,na.action=na.omit))
> 
> Terms added sequentially (first to last)
> 
>             Chisq Df  RobustF     Pr(F)    
> (Intercept)        1                       
> income             1   69.576 < 2.2e-16 ***
> sexe               1   14.989  7.97e-05 ***
> income:sexe        1 -133.293         1    
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> Warning messages:
> 1: In chi.weight(res/Scale, ipsi, yc) - chi.weight(Res/Scale, ipsi,  :
>   longer object length is not a multiple of shorter object length
> 2: In chi.weight(res/Scale, ipsi, yc) - chi.weight(Res/Scale, ipsi,  :
>   longer object length is not a multiple of shorter object length
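
The warning text above is what R emits when two vectors of incompatible lengths are combined element-wise, which would be consistent with residual vectors coming from fits on different numbers of observations. A minimal sketch with made-up residual vectors (the names res_full and res_reduced are hypothetical and not taken from anova.lmRob):

```r
# Made-up residuals from two fits that used different numbers of rows.
res_full    <- c(0.3, -1.2, 0.8, 0.1, -0.5)  # 5 observations
res_reduced <- c(0.4, -1.0, 0.9)             # 3 observations

# Element-wise subtraction recycles the shorter vector and warns;
# tryCatch() captures the warning message instead of printing it.
msg <- tryCatch(res_full - res_reduced,
                warning = function(w) conditionMessage(w))
msg
```

This does not prove that the negative F value has the same cause, only that the warning itself points to a length mismatch between the two residual vectors.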
> 
> 	• Some prominent members of the S/R community will not consider this a bug, but in a companion mail I give arguments for including so-called "Type III sums of squares", i.e. effects tested marginally. In the context of unbalanced ANOVA, other prominent members of the statistics community give extremely convincing arguments (see the other mail). I know it can be done by hand, but for an average user, having it as an optional argument to anova.lmRob would be a strong reason to use R for robust ANOVA.
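
For comparison, this is how sequential and marginal tests differ for an ordinary least-squares fit in base R. It is a sketch with lm() only; it says nothing about how anova.lmRob would implement such an option. The warpbreaks data set ships with R, and dropping the first five rows is just a way to make the design unbalanced.

```r
# Unbalanced two-factor design built from the warpbreaks data.
unbal <- warpbreaks[-(1:5), ]

# Sum-to-zero contrasts so that marginal tests of main effects are meaningful
# in the presence of the interaction.
fit <- lm(breaks ~ wool * tension, data = unbal,
          contrasts = list(wool = "contr.sum", tension = "contr.sum"))

# Sequential tests: each term is added after the preceding ones.
seq_tab <- anova(fit)

# Marginal tests: each term is dropped from the full model in turn.
marg_tab <- drop1(fit, scope = . ~ ., test = "F")

seq_tab
marg_tab
```

With unbalanced data the two tables generally disagree for the main effects, which is why the choice between them matters.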
>  Cheers,
> Olivier
> 
> -- 
> !!! New e-mail, please update your address book !!!
> 
> Olivier.Renaud using unige.ch               http://www.unige.ch/fapse/mad/
> 
> Methodology & Data Analysis - Psychology Dept - University of Geneva
> UniMail, Office 4164  -  40, Bd du Pont d'Arve   -  CH-1211 Geneva 4
> 
> <lmRobMod210.r>
> _______________________________________________
> R-SIG-Robust using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-robust