[R] tests for measures of influence in regression

Frank Tamborello franklin.tamborello at uth.tmc.edu
Mon Feb 22 16:22:30 CET 2010


Thank you, Wolfgang! Now that I know what the function does I can at  
least search some literature to learn about those criteria.

Thanks,
Frank Tamborello

On Feb 22, 2010, at 7:00 AM, Viechtbauer Wolfgang (STAT) wrote:

> I don't think this information can be found in the documentation,  
> but you can always just check the actual influence.measures() and  
> print.infl() code to find out. Most importantly, influence.measures 
> () incldues the following code:
>
> function (model)
> {
>     is.influential <- function(infmat, n) {
>         k <- ncol(infmat) - 4
>         if (n <= k)
>             stop("too few cases, n < k")
>         absmat <- abs(infmat)
>         result <- cbind(absmat[, 1L:k] > 1, absmat[, k + 1] >
>             3 * sqrt(k/(n - k)), abs(1 - infmat[, k + 2]) > (3 *
>             k)/(n - k), pf(infmat[, k + 3], k, n - k) > 0.5,
>             infmat[, k + 4] > (3 * k)/n)
>         dimnames(result) <- dimnames(infmat)
>         result
>     }
>     ...
>     infmat <- cbind(dfbetas, dffit = dffits, cov.r = cov.ratio,
>         cook.d = cooks.d, hat = h)
>     ...
>     is.inf <- is.influential(infmat, sum(h > 0))
>     ...
> }
>
> So, a case is flagged if:
>
> - any of its absolute dfbetas values are larger than 1, or
> - its absolute dffits value is larger than 3*sqrt(k/(n-k)), or
> - abs(1 - covratio) is larger than 3*k/(n-k), or
> - its Cook's distance is larger than the 50% percentile of
>   an F-distributio with k and n-k degrees of freedom, or
> - its hatvalue is larger than 3*k/n,
>
> where k denotes the number of model coefficients (e.g., k = 2 for  
> simple regression with the intercept included in the model).
>
> Best,
>
> --
> Wolfgang Viechtbauer                        http://www.wvbauer.com/
> Department of Methodology and Statistics    Tel: +31 (43) 388-2277
> School for Public Health and Primary Care   Office Location:
> Maastricht University, P.O. Box 616         Room B2.01 (second floor)
> 6200 MD Maastricht, The Netherlands         Debyeplein 1 (Randwyck)
>
>
> ----Original Message----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Frank Tamborello
> Sent: Monday, February 22, 2010 00:39 To: r-help at r-project.org
> Subject: [R] tests for measures of influence in regression
>
>> influence.measures gives several measures of influence for each
>> observation (Cook's Distance, etc) and actually flags observations
>> that it determines are influential by any of the measures. Looks
>> good! But how does it discriminate between the influential and non-
>> influential observations by each of the measures? Like does it do a
>> Bonferroni-corrected t on the residuals identified by the influence
>> measures or some other test?
>>
>> Cheers,
>>
>> Frank Tamborello, PhD
>> W. M. Keck Postdoctoral Fellow
>> School of Health Information Sciences
>> University of Texas Health Science Center, Houston
>>
>>
>>       [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting- 
> guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list