Packing and unpacking preserve the length of a data frame, changing its width. pack() makes a data frame narrower by collapsing a set of columns into a single df-column. unpack() makes a data frame wider by expanding df-columns back out into individual columns.
pack(.data, ..., .names_sep = NULL)

unpack(data, cols, names_sep = NULL, names_repair = "check_unique")
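Argument | Description |
---|---|
... | (tidy-select) Columns to pack, specified using name-variable pairs of the form new_col = c(col1, col2, col3). The right hand side can be any valid tidy selection. |
data, .data | A data frame. |
cols | (tidy-select) Column to unpack. |
names_sep, .names_sep | If NULL, the default, the names will be left as is. If a string, the inner and outer names will be used together. In unpack(), the names of the new outer columns will be formed by pasting together the outer and the inner column names, separated by names_sep. In pack(), the new inner names will have the outer names + names_sep automatically stripped. |
names_repair | Used to check that output data frame has valid names. Must be one of the following options: "minimal" (no name repair or checks, beyond basic existence); "unique" (make sure names are unique and not empty); "check_unique" (the default; no name repair, but check they are unique); "universal" (make the names unique and syntactic); or a function (apply custom name repair). See vctrs::vec_as_names() for more details on these terms. |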
Generally, unpacking is more useful than packing because it simplifies a complex data structure. Currently, few functions work with df-cols, and they are mostly a curiosity, but seem worth exploring further because they mimic the nested column headers that are so popular in Excel.
# Packing =============================================================
# It's not currently clear why you would ever want to pack columns
# since few functions work with this sort of data.
df <- tibble(x1 = 1:3, x2 = 4:6, x3 = 7:9, y = 1:3)
df
#> # A tibble: 3 × 4
#>      x1    x2    x3     y
#>   <int> <int> <int> <int>
#> 1     1     4     7     1
#> 2     2     5     8     2
#> 3     3     6     9     3
df %>% pack(x = starts_with("x"))
#> # A tibble: 3 × 2
#>       y  x$x1   $x2   $x3
#>   <int> <int> <int> <int>
#> 1     1     1     4     7
#> 2     2     2     5     8
#> 3     3     3     6     9
df %>% pack(x = c(x1, x2, x3), y = y)
#> # A tibble: 3 × 2
#>    x$x1   $x2   $x3   y$y
#>   <int> <int> <int> <int>
#> 1     1     4     7     1
#> 2     2     5     8     2
#> 3     3     6     9     3
# .names_sep allows you to strip off common prefixes; this
# acts as a natural inverse to names_sep in unpack()
iris %>%
  as_tibble() %>%
  pack(
    Sepal = starts_with("Sepal"),
    Petal = starts_with("Petal"),
    .names_sep = "."
  )
#> # A tibble: 150 × 3
#>    Species Sepal$Length $Width Petal$Length $Width
#>    <fct>          <dbl>  <dbl>        <dbl>  <dbl>
#>  1 setosa           5.1    3.5          1.4    0.2
#>  2 setosa           4.9    3            1.4    0.2
#>  3 setosa           4.7    3.2          1.3    0.2
#>  4 setosa           4.6    3.1          1.5    0.2
#>  5 setosa           5      3.6          1.4    0.2
#>  6 setosa           5.4    3.9          1.7    0.4
#>  7 setosa           4.6    3.4          1.4    0.3
#>  8 setosa           5      3.4          1.5    0.2
#>  9 setosa           4.4    2.9          1.4    0.2
#> 10 setosa           4.9    3.1          1.5    0.1
#> # … with 140 more rows

# Unpacking ===========================================================
df <- tibble(
  x = 1:3,
  y = tibble(a = 1:3, b = 3:1),
  z = tibble(X = c("a", "b", "c"), Y = runif(3), Z = c(TRUE, FALSE, NA))
)
df
#> # A tibble: 3 × 3
#>       x   y$a    $b z$X       $Y $Z
#>   <int> <int> <int> <chr>  <dbl> <lgl>
#> 1     1     1     3 a     0.0281 TRUE
#> 2     2     2     2 b     0.466  FALSE
#> 3     3     3     1 c     0.390  NA
df %>% unpack(y)
#> # A tibble: 3 × 4
#>       x     a     b z$X       $Y $Z
#>   <int> <int> <int> <chr>  <dbl> <lgl>
#> 1     1     1     3 a     0.0281 TRUE
#> 2     2     2     2 b     0.466  FALSE
#> 3     3     3     1 c     0.390  NA
df %>% unpack(c(y, z))
#> # A tibble: 3 × 6
#>       x     a     b X          Y Z
#>   <int> <int> <int> <chr>  <dbl> <lgl>
#> 1     1     1     3 a     0.0281 TRUE
#> 2     2     2     2 b     0.466  FALSE
#> 3     3     3     1 c     0.390  NA
df %>% unpack(c(y, z), names_sep = "_")
#> # A tibble: 3 × 6
#>       x   y_a   y_b z_X      z_Y z_Z
#>   <int> <int> <int> <chr>  <dbl> <lgl>
#> 1     1     1     3 a     0.0281 TRUE
#> 2     2     2     2 b     0.466  FALSE
#> 3     3     3     1 c     0.390  NA
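
The examples describe .names_sep in pack() as a natural inverse to names_sep in unpack(). The following is a minimal sketch of that round trip, not part of the original examples; the data frame and its column names (x_a, x_b, y) are made up for illustration, and the column order after the round trip is not assumed to match the input.

```r
library(tidyr)
library(tibble)

# Hypothetical input: two columns sharing the prefix "x_", plus one other column.
df <- tibble(x_a = 1:3, x_b = 4:6, y = c("p", "q", "r"))

# Pack the prefixed columns into a df-column named x; .names_sep = "_"
# strips the "x_" prefix from the inner column names.
packed <- pack(df, x = starts_with("x_"), .names_sep = "_")
inner_names <- names(packed$x)

# Unpack with the same separator; the outer name and separator are
# pasted back onto the inner names.
restored <- unpack(packed, x, names_sep = "_")

# Compare column sets rather than positions, since packing may move columns.
same_columns <- setequal(names(restored), names(df))
```

Using the same separator in both directions is what makes the two calls roughly symmetric; with the default NULL, the inner names would be left as they are and the prefix would not be stripped or restored.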