sum_up prints detailed summary statistics (corresponds to Stata's summarize).
    N <- 100
    df <- tibble(
      id = 1:N,
      v1 = sample(5, N, TRUE),
      v2 = sample(1e6, N, TRUE)
    )
    sum_up(df)
    df %>% sum_up(starts_with("v"), d = TRUE)
    df %>% group_by(v1) %>% sum_up()
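For readers without the package at hand, the kind of grouped statistics described above can be approximated with plain dplyr. This is only a rough sketch: the column names (obs, missing, mean, sd, min, max) are illustrative choices and are not taken from sum_up's actual output, and the seed is added only for reproducibility.

```r
library(dplyr)

set.seed(1)  # fixed seed, added only so the illustration is reproducible
N <- 100
df <- tibble(
  id = 1:N,
  v1 = sample(5, N, TRUE),
  v2 = sample(1e6, N, TRUE)
)

# Basic summary statistics of v2 within each level of v1.
# Column names are illustrative, not sum_up's own.
stats <- df %>%
  group_by(v1) %>%
  summarise(
    obs     = sum(!is.na(v2)),
    missing = sum(is.na(v2)),
    mean    = mean(v2, na.rm = TRUE),
    sd      = sd(v2, na.rm = TRUE),
    min     = min(v2, na.rm = TRUE),
    max     = max(v2, na.rm = TRUE)
  )
print(stats)
```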
tab prints distinct rows with their count. Compared to the dplyr function
count, this command adds frequency, percent, and cumulative percent.
    N <- 1e2 ; K = 10
    df <- tibble(
      id = sample(c(NA,1:5), N/K, TRUE),
      v1 = sample(1:5, N/K, TRUE)
    )
    tab(df, id)
    tab(df, id, na.rm = TRUE)
    tab(df, id, v1)
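To make the comparison with count concrete, here is a minimal dplyr-only sketch of the extra columns: a count per distinct value, then percent and cumulative percent. The column names percent and cum_percent are assumptions for illustration and may differ from the labels tab prints.

```r
library(dplyr)

set.seed(1)  # fixed seed, added only so the illustration is reproducible
df <- tibble(id = sample(c(NA, 1:5), 10, TRUE))

# count() gives one row per distinct id (NA is kept as its own group);
# the two extra columns are derived from the counts.
freq <- df %>%
  count(id) %>%
  mutate(
    percent     = 100 * n / sum(n),
    cum_percent = cumsum(percent)
  )
print(freq)
```

Dropping the NA group first (for example with filter(!is.na(id))) would be one way to approximate the na.rm = TRUE variant.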
join is a wrapper for dplyr's merge functions, with the following added options:
The check option verifies that there are no duplicates in the master or using data.tables (as in Stata).
    # merge m:1 v1
    join(x, y, kind = "full", check = m~1)
The gen option specifies the name of a new variable that identifies unmatched and matched rows (as in Stata).
    # merge m:1 v1, gen(_merge)
    join(x, y, kind = "full", gen = "_merge")
The update option fills in missing values of the master dataset with the values from the using dataset.
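The three options can be pictured with a dplyr-only sketch. It does not call join itself. The datasets x and y, the helper columns in_x and in_y, and the .using suffix are all hypothetical, and the 1/2/3 coding of _merge follows Stata's convention (1 = master only, 2 = using only, 3 = matched), which is assumed here rather than confirmed by the text above.

```r
library(dplyr)

# Hypothetical master (x) and using (y) datasets keyed on v1.
x <- tibble(v1 = c(1, 1, 2, 3), a = c(10, NA, 30, 40))
y <- tibble(v1 = c(1, 2, 4), a = c(11, 22, 44), b = c("p", "q", "r"))

# check = m~1: the key must be unique in the using dataset.
stopifnot(anyDuplicated(y$v1) == 0)

merged <- full_join(
  mutate(x, in_x = TRUE),   # marker: row exists in master
  mutate(y, in_y = TRUE),   # marker: row exists in using
  by = "v1",
  suffix = c("", ".using")
) %>%
  mutate(
    # gen = "_merge": flag each row by where it came from.
    `_merge` = case_when(
      !is.na(in_x) & !is.na(in_y) ~ 3L,  # matched
      !is.na(in_x)                ~ 1L,  # master only
      TRUE                        ~ 2L   # using only
    ),
    # update: keep the master value, fall back to the using value when missing.
    a = coalesce(a, a.using)
  ) %>%
  select(-in_x, -in_y, -a.using)

print(merged)
```

In this sketch the second master row, whose a is missing, takes its value from the using dataset. A non-missing master value is never overwritten.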