[R] Descriptive Statistics: useful hacks

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Sun Oct 3 00:31:33 CEST 2021


If you think what you are doing is useful, why do you not put it in a
package?! That is, after all, the whole purpose of packages.

I can only speak for myself, of course, but I doubt that posting long,
involved messages with code here is going to have anything like the
utility of providing a package with carefully written and tested code
and documented functionality. If you have suggestions about how to
improve a *particular* package, a better alternative is probably to
contact the package maintainer.


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sat, Oct 2, 2021 at 3:00 PM Leonard Mada via R-help
<r-help using r-project.org> wrote:
>
> Dear R Users,
>
>
> I have started to compile some useful hacks for the generation of nice
> descriptive statistics. I hope that these functions & hacks are useful
> to the wider R community. I hope that package developers also get some
> inspiration from the code or from these ideas.
>
>
> I have started to review various packages focused on descriptive
> statistics - although I am still at the very beginning.
>
>
> ### Hacks / Code
> - split table headers in 2 rows;
> - split results over 2 rows: view.gtsummary(...);
> - add abbreviations as footnotes: add.abbrev(...);
>
> The results are exported as a web page (using shiny) and can be printed
> as a pdf document. See the following pdf example:
>
> https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.Example_1.pdf
>
>
> ### Example
> # currently focused on package gtsummary
> library(gtsummary)
> library(xml2)
>
> mtcars %>%
>      # rename2():
>      # - see file Tools.Data.R;
>      # - behaves in most cases the same as dplyr::rename();
>      rename2("HP" = "hp", "Displ" = disp, "Wt (klb)" = "wt", "Rar" = drat) %>%
>      # as.factor.df():
>      # - see file Tools.Data.R;
>      # - encode as (ordered) factor;
>      as.factor.df("cyl", "Cyl ") %>%
>      # the Descriptive Statistics:
>      tbl_summary(by = cyl) %>%
>      # note: header & header0 are not defined in this message;
>      modify_header(update = header) %>%
>      add_p() %>%
>      add_overall() %>%
>      modify_header(update = header0) %>%
>      # Hack: split long statistics !!!
>      view.gtsummary(view=FALSE, len=8) %>%
>      add.abbrev(
>          c("Displ", "HP", "Rar", "Wt (klb)" = "Wt"),
>          c("Displacement (in^3)", "Gross horsepower", "Rear axle ratio",
>          "Weight (1000 lbs)"));
>
>
> The required functions are on Github:
> https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.R
>
>
>
> The functions rename2() & as.factor.df() are only data-helpers and can
> also be found on GitHub:
> https://github.com/discoleo/R/blob/master/Stat/Tools.Data.R
>
>
> Note:
>
> 1.) The function add.abbrev() operates on the generated html-code:
>
> - the functionality is more generic and could be used easily with other
> packages that export web pages as well;
>
> 2.) Splitting the statistics is an ugly hack. I plan to redesign the
> functionality using xml-technologies, but I already have too many
> side-projects.
>
> 3.) as.factor.df(): traditionally, one would create derived data-sets or
> add a new column with the variable as factor (as the user may need the
> numeric values for further analysis). But it looked nicer as a single
> block of code.
>
>
> Sincerely,
>
>
> Leonard

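Regarding note 1 in the quoted message: since add.abbrev() is described
as operating on the generated html-code, the same idea can be tried on
any HTML table. The following is only a minimal sketch using xml2. The
name add_abbrev_sketch() and the "abbrev" class are made up for
illustration; this is not the add.abbrev() from the GitHub file and is
not based on how that function is actually implemented.

```r
library(xml2)

# Hypothetical helper (NOT the add.abbrev() from the post):
# appends one paragraph listing the abbreviations right after
# the first <table> found in the page.
add_abbrev_sketch <- function(html, abbrev, full) {
  doc <- read_html(html)
  tbl <- xml_find_first(doc, "//table")
  # Build a single footnote line: "HP: Gross horsepower; ..."
  note <- paste(paste0(abbrev, ": ", full), collapse = "; ")
  # Unnamed argument = node text, named argument = attribute
  xml_add_sibling(tbl, "p", note, class = "abbrev", .where = "after")
  doc
}

page <- "<html><body><table><tr><th>HP</th><th>Displ</th></tr></table></body></html>"
out <- add_abbrev_sketch(page,
  c("HP", "Displ"),
  c("Gross horsepower", "Displacement (in^3)"))
cat(as.character(out))
```

Because the helper only looks for a <table> node, it does not depend on
which package produced the page.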

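Regarding note 3 in the quoted message: the "traditional" route of
adding the factor as a new column, so that the numeric values remain
available, might look like the base R sketch below. The names cars and
Cyl are chosen here for illustration only and do not come from
Tools.Data.R.

```r
# Work on a copy so that mtcars itself stays untouched
cars <- mtcars

# Add the ordered factor as a NEW column; the numeric cyl is kept
cars$Cyl <- factor(cars$cyl, levels = sort(unique(cars$cyl)), ordered = TRUE)

# The numeric column is still usable for further analysis ...
mean(cars$cyl)
# ... while the factor column can drive the grouping
table(cars$Cyl)
```

The cost is an extra column (or a derived data set), which is the
trade-off against the single block of code shown in the quoted message.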
