[R] tests for measures of influence in regression
Frank Tamborello
franklin.tamborello at uth.tmc.edu
Mon Feb 22 16:22:30 CET 2010
Thank you, Wolfgang! Now that I know what the function does I can at
least search some literature to learn about those criteria.
Thanks,
Frank Tamborello
On Feb 22, 2010, at 7:00 AM, Viechtbauer Wolfgang (STAT) wrote:
> I don't think this information can be found in the documentation,
> but you can always just check the actual influence.measures() and
> print.infl() code to find out. Most importantly, influence.measures
> () incldues the following code:
>
> function (model)
> {
> is.influential <- function(infmat, n) {
> k <- ncol(infmat) - 4
> if (n <= k)
> stop("too few cases, n < k")
> absmat <- abs(infmat)
> result <- cbind(absmat[, 1L:k] > 1, absmat[, k + 1] >
> 3 * sqrt(k/(n - k)), abs(1 - infmat[, k + 2]) > (3 *
> k)/(n - k), pf(infmat[, k + 3], k, n - k) > 0.5,
> infmat[, k + 4] > (3 * k)/n)
> dimnames(result) <- dimnames(infmat)
> result
> }
> ...
> infmat <- cbind(dfbetas, dffit = dffits, cov.r = cov.ratio,
> cook.d = cooks.d, hat = h)
> ...
> is.inf <- is.influential(infmat, sum(h > 0))
> ...
> }
>
> So, a case is flagged if:
>
> - any of its absolute dfbetas values are larger than 1, or
> - its absolute dffits value is larger than 3*sqrt(k/(n-k)), or
> - abs(1 - covratio) is larger than 3*k/(n-k), or
> - its Cook's distance is larger than the 50% percentile of
> an F-distributio with k and n-k degrees of freedom, or
> - its hatvalue is larger than 3*k/n,
>
> where k denotes the number of model coefficients (e.g., k = 2 for
> simple regression with the intercept included in the model).
>
> Best,
>
> --
> Wolfgang Viechtbauer http://www.wvbauer.com/
> Department of Methodology and Statistics Tel: +31 (43) 388-2277
> School for Public Health and Primary Care Office Location:
> Maastricht University, P.O. Box 616 Room B2.01 (second floor)
> 6200 MD Maastricht, The Netherlands Debyeplein 1 (Randwyck)
>
>
> ----Original Message----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Frank Tamborello
> Sent: Monday, February 22, 2010 00:39 To: r-help at r-project.org
> Subject: [R] tests for measures of influence in regression
>
>> influence.measures gives several measures of influence for each
>> observation (Cook's Distance, etc) and actually flags observations
>> that it determines are influential by any of the measures. Looks
>> good! But how does it discriminate between the influential and non-
>> influential observations by each of the measures? Like does it do a
>> Bonferroni-corrected t on the residuals identified by the influence
>> measures or some other test?
>>
>> Cheers,
>>
>> Frank Tamborello, PhD
>> W. M. Keck Postdoctoral Fellow
>> School of Health Information Sciences
>> University of Texas Health Science Center, Houston
>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list