---
title: "Example 3: Reshape"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Example 3: Reshape}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

When I designed `longer_dt` and `wider_dt`, I could draw on `pivot_longer` and `pivot_wider` in `tidyr` and on `melt` and `dcast` in `data.table`. Still, designing this API was not easy, as my goal is to let users reshape data with the least pain. Here we try to reproduce the results from the `tidyr` vignette. First load the packages:

```{r setup}
library(tidyfst)
library(tidyr)
```

## Longer

First inspect the data:

```{r}
relig_income
```

In `tidyr`, to get the longer format you need:

```{r,eval=FALSE}
relig_income %>%
  pivot_longer(-religion, names_to = "income", values_to = "count")
```

In `tidyfst`, we have:

```{r,warning=FALSE}
relig_income %>%
  longer_dt("religion",name = "income",value = "count")
```

Another example from `tidyr`:

```{r,eval=FALSE}
billboard

# tidyr way:
billboard %>%
  pivot_longer(
    cols = starts_with("wk"),
    names_to = "week",
    values_to = "rank",
    values_drop_na = TRUE
  )

# tidyfst way:
billboard %>%
  longer_dt(-"wk",
    name = "week",
    value = "rank",
    na.rm = TRUE
  )
# a regex selects the columns to keep, and a minus sign inverts the selection
```

A warning may appear because the columns being merged have different data types; in that case the values are coerced automatically.

## Wider

```{r}
## data
fish_encounters

## tidyr way:
fish_encounters %>%
  pivot_wider(names_from = station, values_from = seen)

## tidyfst way:
fish_encounters %>%
  wider_dt(name = "station",value = "seen")
# if no columns to keep are selected, all columns except the name and value columns are used
```

If you want to fill with 0s, use:

```{r}
fish_encounters %>%
  wider_dt(name = "station",value = "seen",fill = 0)
```

Note that the `name` and `value` parameters must always be provided and must be passed explicitly (with the parameter names attached).
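
Since `longer_dt` and `wider_dt` were designed with `melt` and `dcast` in mind, it may help to see the same round trip written directly in data.table. The following is a minimal sketch on a small made-up table (the data and the object names `wide`, `long` and `wide_again` are illustrative assumptions, not taken from the `tidyr` vignette), and it does not claim to show how tidyfst is implemented internally.

```r
library(data.table)

# Hypothetical wide table: one row per fish, one column per station
wide <- data.table(
  fish    = c("a", "b"),
  Release = c(1L, 1L),
  Lisbon  = c(1L, NA)
)

# Wide -> long: keep `fish`, stack the station columns, drop missing values
long <- melt(
  wide,
  id.vars = "fish",
  variable.name = "station",
  value.name = "seen",
  na.rm = TRUE
)

# Long -> wide: one column per station, filling absent combinations with 0
wide_again <- dcast(long, fish ~ station, value.var = "seen", fill = 0L)
```

Here `na.rm = TRUE` plays the role of `na.rm` in `longer_dt`, and `fill = 0L` plays the role of `fill` in `wider_dt`.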

## More complicated example

This example comes from data.table and has been used in tidyr too. We'll try to do it in tidyfst in this example. Suppose we have a data.frame as below:

```{r}
family <- fread("family_id age_mother dob_child1 dob_child2 dob_child3 gender_child1 gender_child2 gender_child3
1 30 1998-11-26 2000-01-29 NA 1 2 NA
2 27 1996-06-22 NA NA 2 NA NA
3 26 2002-07-11 2004-04-05 2007-09-02 2 2 1
4 32 2004-10-10 2009-08-27 2012-07-21 1 1 1
5 29 2000-12-05 2005-02-28 NA 2 1 NA")
family
```

And we want to reshape the data.table to look like this:

```{r}
#     family_id age_mother  child        dob gender
#  1:         1         30 child1 1998-11-26      1
#  2:         1         30 child2 2000-01-29      2
#  3:         1         30 child3       <NA>   <NA>
#  4:         2         27 child1 1996-06-22      2
#  5:         2         27 child2       <NA>   <NA>
#  6:         2         27 child3       <NA>   <NA>
#  7:         3         26 child1 2002-07-11      2
#  8:         3         26 child2 2004-04-05      2
#  9:         3         26 child3 2007-09-02      1
# 10:         4         32 child1 2004-10-10      1
# 11:         4         32 child2 2009-08-27      1
# 12:         4         32 child3 2012-07-21      1
# 13:         5         29 child1 2000-12-05      2
# 14:         5         29 child2 2005-02-28      1
# 15:         5         29 child3       <NA>   <NA>
```

`data.table::melt` and `tidyr::pivot_longer` can do this reshape in one step, but the one-step call is not easy to understand. Here we'll do it step by step to see what actually happens during the reshape.

```{r,warning=FALSE}
family %>%
  longer_dt(1:2) %>%
  separate_dt("name",into = c("class","child")) %>%
  wider_dt(-"class|value",
           name = "class",
           value = "value")
```

In this process, we first make the table longer, then separate the `name` column, and finally make the table wider. *tidyfst* is not going to support this complicated reshape in one step: it might be easy to implement, but packing three procedures into one step makes the operation much harder to understand. If you still prefer that way, use `data.table::melt` or `tidyr::pivot_longer` instead.
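
For readers who do prefer a single call, the following is a hedged sketch of a one-step version using data.table's `melt` with `patterns()`. The object name `one_step` is an illustrative assumption, and the input is passed through `fread(text = ...)` only to keep the block self-contained. Note that in this form the `child` column holds the indices 1, 2 and 3 (as a factor) rather than the labels `child1` to `child3`, and the rows come out grouped by child rather than by family.

```r
library(data.table)

family <- fread(text = "family_id age_mother dob_child1 dob_child2 dob_child3 gender_child1 gender_child2 gender_child3
1 30 1998-11-26 2000-01-29 NA 1 2 NA
2 27 1996-06-22 NA NA 2 NA NA
3 26 2002-07-11 2004-04-05 2007-09-02 2 2 1
4 32 2004-10-10 2009-08-27 2012-07-21 1 1 1
5 29 2000-12-05 2005-02-28 NA 2 1 NA")

# Melt two groups of columns at once: every column matching "^dob" goes
# into `dob`, every column matching "^gender" goes into `gender`
one_step <- melt(
  family,
  id.vars = c("family_id", "age_mother"),
  measure.vars = patterns("^dob", "^gender"),
  variable.name = "child",
  value.name = c("dob", "gender")
)
one_step
```

Compared with the three-step pipeline, the single call hides the lengthen, separate and widen stages inside the `patterns()` argument, which is the trade-off discussed above.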