[Rd] as.data.frame() methods for model objects
Martin Maechler
maechler at stat.math.ethz.ch
Fri Jan 17 17:03:30 CET 2025
>>>>> SOEIRO Thomas via R-devel
>>>>> on Fri, 17 Jan 2025 14:19:31 +0000 writes:
> Following Duncan Murdoch's off-list comments (thanks again!), here is a more complete/flexible version:
>
> as.data.frame.lm <- function(x, ..., level = 0.95, exp = FALSE) {
>   cf <- x |> summary() |> stats::coef()
>   ci <- stats::confint(x, level = level)
>   if (exp) {
>     cf[, "Estimate"] <- exp(cf[, "Estimate"])
>     ci <- exp(ci)
>   }
>   df <- data.frame(row.names(cf), cf, ci, row.names = NULL)
>   names(df) <- c("term", "estimate", "std.error", "statistic", "p.value", "conf.low", "conf.high")
>   df
> }
Indeed, using 'level' is much better already.
Instead of  exp = FALSE,
I'd use     transFUN = NULL
and then

    if(!is.null(transFUN)) {
        stopifnot(is.function(transFUN))
        cf[, "Estimate"] <- transFUN(cf[, "Estimate"])
        ci <- transFUN(ci)
    }

Noting that I'd want "inverse-logit" (*) in some cases, but also
different things for different link functions, hence just
exp = T/F  is not enough.
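
For concreteness, a minimal sketch of how the quoted method might look with that change. It keeps the column layout of the quoted version; only 'transFUN' is new, and the whole thing is an untested-in-the-wild illustration rather than a proposed patch.

```r
## Sketch only: the quoted method with 'exp = FALSE' replaced by 'transFUN = NULL'
as.data.frame.lm <- function(x, ..., level = 0.95, transFUN = NULL) {
    cf <- stats::coef(summary(x))
    ci <- stats::confint(x, level = level)
    if(!is.null(transFUN)) {
        stopifnot(is.function(transFUN))
        ## as in the quoted version, only the estimate and the interval are
        ## transformed; std.error and statistic stay on the original scale
        cf[, "Estimate"] <- transFUN(cf[, "Estimate"])
        ci <- transFUN(ci)
    }
    df <- data.frame(row.names(cf), cf, ci, row.names = NULL)
    names(df) <- c("term", "estimate", "std.error", "statistic",
                   "p.value", "conf.low", "conf.high")
    df
}

fm <- lm(breaks ~ wool + tension, data = warpbreaks)
d0 <- as.data.frame(fm)                  # untransformed
d1 <- as.data.frame(fm, transFUN = exp)  # what exp = TRUE gave before
```

Passing a function instead of a logical means the caller picks whatever back-transformation matches the link, e.g. transFUN = plogis for a logit link.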
Martin
--
*) "inverse-logit" is simply R's plogis() function; quite a
few people have been re-inventing it, also in their packages ...
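
A small check of that footnote, for illustration only: plogis() agrees with the usual hand-written formula, and qlogis() is its inverse (the logit). The function name 'inv_logit' below is made up for the comparison.

```r
eta <- c(-2, 0, 1.5)                        # arbitrary values on the logit scale
inv_logit <- function(x) 1 / (1 + exp(-x))  # hypothetical re-invention
all.equal(plogis(eta), inv_logit(eta))      # TRUE
all.equal(qlogis(plogis(eta)), eta)         # TRUE: qlogis() undoes plogis()
```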
> > lm(breaks ~ wool + tension, warpbreaks) |> as.data.frame()
>          term   estimate std.error statistic      p.value  conf.low  conf.high
> 1 (Intercept)  39.277778  3.161783 12.422667 6.681866e-17  32.92715 45.6284061
> 2       woolB  -5.777778  3.161783 -1.827380 7.361367e-02 -12.12841  0.5728505
> 3    tensionM -10.000000  3.872378 -2.582393 1.278683e-02 -17.77790 -2.2221006
> 4    tensionH -14.722222  3.872378 -3.801856 3.913842e-04 -22.50012 -6.9443228
>
> > glm(breaks < 20 ~ wool + tension, data = warpbreaks) |> as.data.frame(exp = TRUE)
> Waiting for profiling to be done...
>          term estimate std.error statistic    p.value  conf.low conf.high
> 1 (Intercept) 1.076887 0.1226144 0.6041221 0.54849393 0.8468381  1.369429
> 2       woolB 1.076887 0.1226144 0.6041221 0.54849393 0.8468381  1.369429
> 3    tensionM 1.248849 0.1501714 1.4797909 0.14520270 0.9304302  1.676239
> 4    tensionH 1.395612 0.1501714 2.2196863 0.03100435 1.0397735  1.873229
>
> Thank you.
>
> Best regards,
> Thomas
>
>
>
> -----Original Message-----
> From: SOEIRO Thomas
> Sent: Thursday, 16 January 2025 14:36
> To: r-devel at r-project.org
> Subject: as.data.frame() methods for model objects
>
> Hello all,
>
> Would there be any interest in adding as.data.frame() methods for model objects?
> Of course there are packages (e.g. broom), but I think providing methods would be more discoverable (and the patch would be small).
> It is really useful for exporting model results or for plotting.
>
> e.g.:
>
> as.data.frame.lm <- function(x) { # could get other arguments, e.g. exp = TRUE/FALSE to exponentiate estimate, conf.low, conf.high
>   cf <- x |> summary() |> stats::coef()
>   ci <- stats::confint(x)
>   data.frame(
>     term = row.names(cf),
>     estimate = cf[, "Estimate"],
>     p.value = cf[, 4], # magic number because name changes between lm() and glm(*, family = *)
>     conf.low = ci[, "2.5 %"],
>     conf.high = ci[, "97.5 %"],
>     row.names = NULL
>   )
> }
>
> > lm(breaks ~ wool + tension, warpbreaks) |> as.data.frame()
>          term   estimate      p.value  conf.low  conf.high
> 1 (Intercept)  39.277778 6.681866e-17  32.92715 45.6284061
> 2       woolB  -5.777778 7.361367e-02 -12.12841  0.5728505
> 3    tensionM -10.000000 1.278683e-02 -17.77790 -2.2221006
> 4    tensionH -14.722222 3.913842e-04 -22.50012 -6.9443228
>
> > glm(breaks < 20 ~ wool + tension, data = warpbreaks) |> as.data.frame()
> Waiting for profiling to be done...
>          term   estimate    p.value    conf.low conf.high
> 1 (Intercept) 0.07407407 0.54849393 -0.16624575 0.3143939
> 2       woolB 0.07407407 0.54849393 -0.16624575 0.3143939
> 3    tensionM 0.22222222 0.14520270 -0.07210825 0.5165527
> 4    tensionH 0.33333333 0.03100435  0.03900286 0.6276638
>
> Thank you.
>
> Best regards,
> Thomas
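
On the "magic number" comment in the first version quoted above: a short illustration of why the p-value column cannot simply be picked by name. The column names of the coefficient table depend on the model and family. Note that family = binomial is an assumption added here for the illustration (the quoted glm() call uses the default family), and 'p_col' is a hypothetical helper, not an existing function.

```r
## Column names of the coefficient table depend on the model / family
fm_lm  <- lm(breaks ~ wool + tension, data = warpbreaks)
fm_glm <- glm(breaks < 20 ~ wool + tension, family = binomial,
              data = warpbreaks)
cn_lm  <- colnames(coef(summary(fm_lm)))   # ..., "t value", "Pr(>|t|)"
cn_glm <- colnames(coef(summary(fm_glm)))  # ..., "z value", "Pr(>|z|)"

## hypothetical helper: find the p-value column by pattern, not by position
p_col <- function(cf) grep("^Pr\\(", colnames(cf))
p_col(coef(summary(fm_lm)))
p_col(coef(summary(fm_glm)))
```

Matching on the "Pr(" prefix is one possible alternative to cf[, 4]; whether it is more robust than the fixed position is a judgment call.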