Create Demographic Table

Introduction

This vignette of package DemographicTable (CRAN, GitHub) presents an idiot-proof interface to create a summary table of simple statistics, often known as a demographic table.

Note to Users

Examples in this vignette require that package DemographicTable is attached to the search path:

library(DemographicTable)
#> Loading required package: flextable
set_flextable_defaults(font.size = 9)

Users may remove the last pipe |> as_flextable() from all examples. The author of this package is forced to include it in this vignette so that package rmarkdown renders the tables correctly.

Demographic Table

Data preparation

tgr = ToothGrowth |> 
  within.data.frame(expr = {
    dose = factor(dose) 
  })

Summary of all subjects

tgr |>
  DemographicTable(include = c('supp', 'len', 'dose')) |> 
  as_flextable()

| | tgr, n=60 |
|---|---|
| len. | |
| mean±sd | 18.8±7.6 |
| median; IQR | . |
| range | 4.2~33.9 |
| dose: n (%). | |
| 0.5 | 20 (33.3%) |
| 1 | 20 (33.3%) |
| 2 | 20 (33.3%) |
| supp: n (%). | |
| OJ | 30 (50.0%) |
| VC | 30 (50.0%) |
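The cells of the table above can be cross-checked with base R. The following is a minimal sketch, not the package's internal code; the object names len_mean_sd, len_range and supp_n are hypothetical, and the one-decimal rounding is assumed from the table's appearance.

```r
# Cross-check of the 'all subjects' summary using base R only
len = ToothGrowth$len

# mean and standard deviation, rounded to one decimal (rounding is an assumption)
len_mean_sd = sprintf('%.1f±%.1f', mean(len), sd(len))

# minimum and maximum, joined with '~' as in the table
len_range = paste(range(len), collapse = '~')

# number of subjects per level of supp
supp_n = table(ToothGrowth$supp)

len_mean_sd
len_range
supp_n
```

The printed values should agree with the len and supp rows of the table.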

Summary by one group

The color of each group is determined by scales::pal_hue(), which is the default color palette used in package ggplot2.
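To preview the colors that would be assigned, the palette function can be called directly. This is a minimal sketch that assumes package scales (version 1.3.0 or later, where pal_hue() is available) is installed; the names n_groups and group_colors are hypothetical.

```r
# Number of groups in the example that follows (levels of supp: OJ and VC)
n_groups = nlevels(ToothGrowth$supp)

# pal_hue() returns a function; calling it with n gives n hex color codes
group_colors = scales::pal_hue()(n_groups)
group_colors
```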

tgr |>
  DemographicTable(groups = 'supp', include = c('len', 'dose')) |> 
  as_flextable()

| | tgr, n=60 | supp: OJ, n=30 (50.0%) | supp: VC, n=30 (50.0%) | Signif |
|---|---|---|---|---|
| len. | | | | 0.064 |
| mean±sd | 18.8±7.6 | 20.7±6.6 | 17.0±8.3 | Wilcoxon-Mann-Whitney |
| median; IQR | . | 22.7; 10.2 | . | |
| range | 4.2~33.9 | 8.2~30.9 | 4.2~33.9 | |
| dose: n (%). | | | | 1.000 |
| 0.5 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) | Fisher's Exact |
| 1 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) | |
| 2 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) | |
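The Signif column can be compared against the corresponding tests in base R. The sketch below assumes that the labels "Wilcoxon-Mann-Whitney" and "Fisher's Exact" correspond to stats::wilcox.test() and stats::fisher.test() with default settings; the package may set them up differently internally. The names p_len and p_dose are hypothetical.

```r
# len by supp: Wilcoxon-Mann-Whitney test
# exact = FALSE requests the normal approximation, since len contains ties
p_len = wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)$p.value

# dose by supp: Fisher's exact test on the 2-by-3 contingency table
p_dose = fisher.test(table(ToothGrowth$supp, ToothGrowth$dose))$p.value

round(c(len = p_len, dose = p_dose), digits = 3)
```

If the assumption holds, the rounded values should be close to the 0.064 and 1.000 reported in the table.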

Users may hide the \(p\)-values with compare = FALSE.

tgr |>
  DemographicTable(groups = 'supp', include = c('len', 'dose'), compare = FALSE) |> 
  as_flextable()

| | tgr, n=60 | supp: OJ, n=30 (50.0%) | supp: VC, n=30 (50.0%) |
|---|---|---|---|
| len. | | | |
| mean±sd | 18.8±7.6 | 20.7±6.6 | 17.0±8.3 |
| median; IQR | . | 22.7; 10.2 | . |
| range | 4.2~33.9 | 8.2~30.9 | 4.2~33.9 |
| dose: n (%). | | | |
| 0.5 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) |
| 1 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) |
| 2 | 20 (33.3%) | 10 (33.3%) | 10 (33.3%) |

Summary by multiple groups

tgr |>
  DemographicTable(groups = c('supp', 'dose'), include = c('len', 'supp')) |>
  as_flextable()

| | tgr, n=60 | supp: OJ, n=30 (50.0%) | supp: VC, n=30 (50.0%) | Signif1 | dose: 0.5, n=20 (33.3%) | dose: 1, n=20 (33.3%) | dose: 2, n=20 (33.3%) | Signif2 |
|---|---|---|---|---|---|---|---|---|
| len. | | | | 0.064 | | | | ★ 0.000; ⸢1⸥ vs. ⸢0.5⸥ |
| mean±sd | 18.8±7.6 | 20.7±6.6 | 17.0±8.3 | Wilcoxon-Mann-Whitney | 10.6±4.5 | 19.7±4.4 | 26.1±3.8 | ★ 0.000; ⸢2⸥ vs. ⸢0.5⸥ |
| median; IQR | . | 22.7; 10.2 | . | | . | . | . | ★ 0.000; ⸢2⸥ vs. ⸢1⸥ |
| range | 4.2~33.9 | 8.2~30.9 | 4.2~33.9 | | 4.2~21.5 | 13.6~27.3 | 18.5~33.9 | Pairwise Two-Sample t |
| supp: n (%). | | | | ★ 0.000 | | | | 1.000 |
| OJ | 30 (50.0%) | 30 (100.0%) | - | Fisher's Exact | 10 (50.0%) | 10 (50.0%) | 10 (50.0%) | Fisher's Exact |
| VC | 30 (50.0%) | - | 30 (100.0%) | | 10 (50.0%) | 10 (50.0%) | 10 (50.0%) | |
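With three dose levels, the comparison of len is reported pairwise. A rough base R counterpart is stats::pairwise.t.test(), sketched below. Whether the package uses this function, and which multiplicity adjustment it applies, are assumptions here; the name pw is hypothetical.

```r
# Pairwise two-sample t-tests of len across the three dose levels
# p.adjust.method is left at its default ('holm'); the package's choice is not stated in this vignette
pw = pairwise.t.test(x = ToothGrowth$len, g = factor(ToothGrowth$dose))

# Lower-triangular matrix of adjusted p-values, one entry per pair of dose levels
pw$p.value
```

All three pairwise p-values are very small, which is in line with the three 0.000 entries in the table.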

Concatenate multiple DemographicTables

tb1 = CO2 |>
  DemographicTable(groups = 'Type', include = c('conc', 'uptake'))
tb2 = CO2 |>
  subset(subset = (Treatment == 'nonchilled')) |> 
  DemographicTable(groups = 'Type', include = c('conc', 'uptake'), data.name = 'CO2_nonchilled')
c(tb1, tb2) |> as_flextable()

| | CO2, n=84 | Type: Quebec, n=42 (50.0%) | Type: Mississippi, n=42 (50.0%) | Signif1 | CO2_nonchilled, n=42 | Type: Quebec, n=21 (50.0%) | Type: Mississippi, n=21 (50.0%) | Signif2 |
|---|---|---|---|---|---|---|---|---|
| conc. | | | | 1.000 | | | | 1.000 |
| mean±sd | 435.0±295.9 | 435.0±297.7 | 435.0±297.7 | Two-Sample t | 435.0±297.7 | 435.0±301.4 | 435.0±301.4 | Wilcoxon-Mann-Whitney |
| median; IQR | 350.0; 500.0 | 350.0; 500.0 | 350.0; 500.0 | | 350.0; 500.0 | 350.0; 500.0 | 350.0; 500.0 | |
| range | 95.0~1000.0 | 95.0~1000.0 | 95.0~1000.0 | | 95.0~1000.0 | 95.0~1000.0 | 95.0~1000.0 | |
| uptake. | | | | 0.000★ | | | | 0.000★ |
| mean±sd | 27.2±10.8 | 33.5±9.7 | 20.9±7.8 | Two-Sample t | 30.6±9.7 | 35.3±9.6 | 26.0±7.4 | Wilcoxon-Mann-Whitney |
| median; IQR | 28.3; 19.2 | 37.2; 9.8 | 19.3; 14.2 | | 31.3; 12.2 | 39.2; 9.4 | 28.1; 9.1 | |
| range | 7.7~45.5 | 9.3~45.5 | 7.7~35.5 | | 10.6~45.5 | 13.6~45.5 | 10.6~35.5 | |
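The sample sizes in the CO2_nonchilled half of the table follow from the subset() call. A minimal base R sketch to confirm them; the name co2_nc is hypothetical.

```r
# Same subset as used for tb2
co2_nc = subset(CO2, subset = (Treatment == 'nonchilled'))

# Total number of rows in the subset
nrow(co2_nc)

# Rows per level of Type within the subset
table(co2_nc$Type)
```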

Exception Handling

Missing value in groups

MASS::survey |>
  DemographicTable(groups = c('M.I'), include = c('Pulse', 'Fold')) |>
  as_flextable()

M.I: n=28 (11.8%) missing

| | MASS::survey, n=237 | M.I: Imperial, n=68 (28.7%) | M.I: Metric, n=141 (59.5%) | Signif |
|---|---|---|---|---|
| Pulse. | n*=192 | n*=59 | n*=112 | 0.974 |
| mean±sd | 74.2±11.7 | 73.9±10.2 | 73.9±12.2 | Two-Sample t |
| median; IQR | . | 72.0; 12.0 | . | |
| range | 35.0~104.0 | 40.0~104.0 | 35.0~104.0 | |
| Fold: n (%). | | | | 0.532 |
| L on R | 99 (41.8%) | 32 (47.1%) | 59 (41.8%) | Fisher's Exact |
| Neither | 18 (7.6%) | 3 (4.4%) | 12 (8.5%) | |
| R on L | 120 (50.6%) | 33 (48.5%) | 70 (49.6%) | |
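The missing-value counts reported above can be inspected before building the table. This is a minimal base R sketch that assumes package MASS is installed and that n* denotes the number of non-missing observations, which is an interpretation of the output rather than something stated in this vignette. The names m_i and n_pulse are hypothetical.

```r
# Levels of the grouping variable M.I, with missing values counted separately
m_i = table(MASS::survey$M.I, useNA = 'ifany')
m_i

# Number of non-missing Pulse values in the whole data set
n_pulse = sum(!is.na(MASS::survey$Pulse))
n_pulse
```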

Use of logical values

Use of logical values is discouraged, as this practice has proved confusing to scientists without a strong data background. A warning message will be printed.

#> Some scientists do not understand logical value (e.g., arm_intervention being TRUE/FALSE)
#> Consider using 2-level factor (e.g., arm being intervention/control)
mtc = mtcars |>
  within.data.frame(expr = {
    vs = as.logical(vs)
    am = as.logical(am)
  })
tryCatch(DemographicTable(mtc, groups = 'am', include = c('hp', 'drat')), warning = identity)
#> <simpleWarning in DemographicTable(mtc, groups = "am", include = c("hp", "drat")): Some scientists do not understand logical value (e.g., arm_intervention being TRUE/FALSE)
#> Consider using 2-level factor (e.g., arm being intervention/control)>
tryCatch(DemographicTable(mtc, groups = 'cyl', include = c('vs')), warning = identity)
#> <simpleWarning in FUN(X[[i]], ...): Some scientists do not understand logical value (e.g., arm_intervention being TRUE/FALSE)
#> Consider using 2-level factor (e.g., arm being intervention/control)>
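Following the suggestion in the warning, the 0/1 columns can be recoded as 2-level factors instead of logical values. A minimal sketch in base R; the name mtc2 is hypothetical, and the level labels are taken from the mtcars help page.

```r
# Recode the 0/1 indicators as labelled 2-level factors rather than logical values
mtc2 = mtcars |>
  within.data.frame(expr = {
    vs = factor(vs, levels = c(0, 1), labels = c('V-shaped', 'straight'))
    am = factor(am, levels = c(0, 1), labels = c('automatic', 'manual'))
  })

# Each recoded column now has two named levels
table(mtc2$am)
table(mtc2$vs)
```

The recoded data can then be passed to DemographicTable() in the same way as in the earlier examples, for instance with groups = 'am'.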