across() makes it easy to apply the same transformation to multiple columns, allowing you to use select() semantics inside "data-masking" functions like summarise() and mutate(). See vignette("colwise") for more details.
if_any() and if_all() apply the same predicate function to a selection of columns and combine the results into a single logical vector: if_any() is TRUE when the predicate is TRUE for any of the selected columns; if_all() is TRUE when the predicate is TRUE for all selected columns.
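The difference between the two predicates is easiest to see on a small table. The following is a minimal sketch, assuming dplyr 1.0.4 or later is installed; the `scores` tibble and its column names are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data: two score columns with some missing values
scores <- tibble(
  id    = 1:4,
  test1 = c(90, NA, 70, NA),
  test2 = c(85, 60, NA, NA)
)

# Keep rows where at least one selected column is missing
any_missing <- scores %>% filter(if_any(c(test1, test2), is.na))

# Keep rows where every selected column is missing
all_missing <- scores %>% filter(if_all(c(test1, test2), is.na))

any_missing$id
all_missing$id
```

Here `any_missing` should contain ids 2, 3 and 4, while `all_missing` should contain only id 4, because that is the only row where both scores are NA.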
across() supersedes the family of "scoped variants" like summarise_at(), summarise_if(), and summarise_all().
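To make the relationship with the scoped variants concrete, here is a small sketch that writes the same grouped summary both ways. It uses the built-in `mtcars` data set; the choice of columns is arbitrary and only for illustration.

```r
library(dplyr)

# Superseded scoped variant: columns are wrapped in vars()
old_style <- mtcars %>%
  group_by(cyl) %>%
  summarise_at(vars(mpg, hp), mean)

# across() equivalent: columns use ordinary tidyselect syntax
new_style <- mtcars %>%
  group_by(cyl) %>%
  summarise(across(c(mpg, hp), mean))

# Compare the two results as plain data frames
all.equal(as.data.frame(old_style), as.data.frame(new_style))
```

In the same spirit, a `summarise_if(is.numeric, mean)` call would typically become `summarise(across(where(is.numeric), mean))`.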
across(.cols = everything(), .fns = NULL, ..., .names = NULL)
if_any(.cols = everything(), .fns = NULL, ..., .names = NULL)
if_all(.cols = everything(), .fns = NULL, ..., .names = NULL)
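Argument | Description |
---|---|
.fns | Functions to apply to each of the selected columns. Possible values are: NULL, to return the columns untransformed; a function, e.g. mean; a purrr-style lambda, e.g. ~ mean(.x, na.rm = TRUE); a list of functions or lambdas, e.g. list(mean = mean, sd = sd). Within these functions you can use cur_column() and cur_group() to access the current column and grouping keys respectively. |
... | Additional arguments for the function calls in .fns. |
.names | A glue specification that describes how to name the output columns. This can use {.col} to stand for the selected column name and {.fn} to stand for the name of the function being applied. |
cols, .cols | <tidy-select> Columns to transform. |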
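As an illustration of how `.fns`, `cur_column()` and `.names` can work together, the sketch below scales each column by a factor looked up by column name. The `multipliers` vector and the `_scaled` suffix are hypothetical choices made for this example, not part of dplyr.

```r
library(dplyr)

df <- tibble(a = c(1, 2), b = c(10, 20))

# Hypothetical per-column scaling factors, keyed by column name
multipliers <- c(a = 2, b = 3)

scaled <- df %>%
  mutate(across(
    c(a, b),
    # cur_column() gives the name of the column currently being processed
    ~ .x * multipliers[[cur_column()]],
    # {.col} is replaced by the selected column's name
    .names = "{.col}_scaled"
  ))

scaled
```

Because `.names` is supplied, the original columns `a` and `b` are kept and the results are added as `a_scaled` and `b_scaled`.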
across() returns a tibble with one column for each column in .cols and each function in .fns.
if_any() and if_all() return a logical vector.
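Because across() returns a tibble, its result can be handed to any function that accepts a data frame, and because if_any() and if_all() return a logical vector, their result can be stored as a column. A minimal sketch, assuming dplyr 1.0.4 or later; the `df` tibble and the threshold of 5 are made up for illustration:

```r
library(dplyr)

df <- tibble(a = 1:3, b = 4:6, label = c("x", "y", "z"))

# With .fns left as NULL, across() returns the selected columns unchanged
# as a tibble, which rowSums() accepts like any data frame
totals <- df %>% mutate(total = rowSums(across(c(a, b))))

# if_any() yields one logical value per row, so it can become a column
flags <- df %>% mutate(big = if_any(c(a, b), ~ .x > 5))

totals$total
flags$big
```

With this data, `total` should hold the row sums 5, 7 and 9, and `big` should be TRUE only for the last row, where `b` is 6.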
R code in dplyr verbs is generally evaluated once per group. Inside across(), however, code is evaluated once for each combination of columns and groups. If the evaluation timing is important, for example if you're generating random variables, think about when it should happen and place your code accordingly.
gdf <-
  tibble(g = c(1, 1, 2, 3), v1 = 10:13, v2 = 20:23) %>%
  group_by(g)

set.seed(1)

# Outside: 1 normal variate
n <- rnorm(1)
gdf %>% mutate(across(v1:v2, ~ .x + n))
## # A tibble: 4 x 3
## # Groups:   g [3]
##       g    v1    v2
##   <dbl> <dbl> <dbl>
## 1     1  9.37  19.4
## 2     1 10.4   20.4
## 3     2 11.4   21.4
## 4     3 12.4   22.4

# Inside a verb: 3 normal variates (ngroup)
gdf %>% mutate(n = rnorm(1), across(v1:v2, ~ .x + n))
## # A tibble: 4 x 4
## # Groups:   g [3]
##       g    v1    v2      n
##   <dbl> <dbl> <dbl>  <dbl>
## 1     1  10.2  20.2  0.184
## 2     1  11.2  21.2  0.184
## 3     2  11.2  21.2 -0.836
## 4     3  14.6  24.6  1.60

# Inside `across()`: 6 normal variates (ncol * ngroup)
gdf %>% mutate(across(v1:v2, ~ .x + rnorm(1)))
## # A tibble: 4 x 3
## # Groups:   g [3]
##       g    v1    v2
##   <dbl> <dbl> <dbl>
## 1     1  10.3  20.7
## 2     1  11.3  21.7
## 3     2  11.2  22.6
## 4     3  13.5  22.7
See also c_across() for a function that returns a vector.
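Since c_across() returns a vector rather than a tibble, it pairs naturally with rowwise() for per-row computations. The following is a small sketch; the `df` tibble and the `row_max` column name are hypothetical.

```r
library(dplyr)

df <- tibble(
  id = 1:3,
  x  = c(1, 4, 7),
  y  = c(2, 5, 8),
  z  = c(3, 6, 9)
)

row_stats <- df %>%
  rowwise() %>%
  # c_across() combines the selected columns of the current row into one vector
  mutate(row_max = max(c_across(x:z))) %>%
  ungroup()

row_stats$row_max
```

For this data the row maxima should be 3, 6 and 9, because `z` holds the largest value in every row.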
# across() -----------------------------------------------------------------
# Different ways to select the same set of columns
# See <https://tidyselect.r-lib.org/articles/syntax.html> for details
iris %>%
  as_tibble() %>%
  mutate(across(c(Sepal.Length, Sepal.Width), round))
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>
#>  1            5           4          1.4         0.2 setosa
#>  2            5           3          1.4         0.2 setosa
#>  3            5           3          1.3         0.2 setosa
#>  4            5           3          1.5         0.2 setosa
#>  5            5           4          1.4         0.2 setosa
#>  6            5           4          1.7         0.4 setosa
#>  7            5           3          1.4         0.3 setosa
#>  8            5           3          1.5         0.2 setosa
#>  9            4           3          1.4         0.2 setosa
#> 10            5           3          1.5         0.1 setosa
#> # … with 140 more rows

# The following calls select the same two columns and print the same result
iris %>%
  as_tibble() %>%
  mutate(across(c(1, 2), round))
iris %>%
  as_tibble() %>%
  mutate(across(1:Sepal.Width, round))
iris %>%
  as_tibble() %>%
  mutate(across(where(is.double) & !c(Petal.Length, Petal.Width), round))

# A purrr-style formula
iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), ~ mean(.x, na.rm = TRUE)))
#> # A tibble: 3 x 3
#>   Species    Sepal.Length Sepal.Width
#>   <fct>             <dbl>       <dbl>
#> 1 setosa             5.01        3.43
#> 2 versicolor         5.94        2.77
#> 3 virginica          6.59        2.97

# A named list of functions
iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd)))
#> # A tibble: 3 x 5
#>   Species    Sepal.Length_mean Sepal.Length_sd Sepal.Width_mean Sepal.Width_sd
#>   <fct>                  <dbl>           <dbl>            <dbl>          <dbl>
#> 1 setosa                  5.01           0.352             3.43          0.379
#> 2 versicolor              5.94           0.516             2.77          0.314
#> 3 virginica               6.59           0.636             2.97          0.322

# Use the .names argument to control the output names
iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), mean, .names = "mean_{.col}"))
#> # A tibble: 3 x 3
#>   Species    mean_Sepal.Length mean_Sepal.Width
#>   <fct>                  <dbl>            <dbl>
#> 1 setosa                  5.01             3.43
#> 2 versicolor              5.94             2.77
#> 3 virginica               6.59             2.97

iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd), .names = "{.col}.{.fn}"))
#> # A tibble: 3 x 5
#>   Species    Sepal.Length.mean Sepal.Length.sd Sepal.Width.mean Sepal.Width.sd
#>   <fct>                  <dbl>           <dbl>            <dbl>          <dbl>
#> 1 setosa                  5.01           0.352             3.43          0.379
#> 2 versicolor              5.94           0.516             2.77          0.314
#> 3 virginica               6.59           0.636             2.97          0.322

# When the list is not named, .fn is replaced by the function's position
iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), list(mean, sd), .names = "{.col}.fn{.fn}"))
#> # A tibble: 3 x 5
#>   Species    Sepal.Length.fn1 Sepal.Length.fn2 Sepal.Width.fn1 Sepal.Width.fn2
#>   <fct>                 <dbl>            <dbl>           <dbl>           <dbl>
#> 1 setosa                 5.01            0.352            3.43           0.379
#> 2 versicolor             5.94            0.516            2.77           0.314
#> 3 virginica              6.59            0.636            2.97           0.322

# if_any() and if_all() ----------------------------------------------------
iris %>%
  filter(if_any(ends_with("Width"), ~ . > 4))
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.7         4.4          1.5         0.4  setosa
#> 2          5.2         4.1          1.5         0.1  setosa
#> 3          5.5         4.2          1.4         0.2  setosa

iris %>%
  filter(if_all(ends_with("Width"), ~ . > 2))
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
#> 1           6.3         3.3          6.0         2.5 virginica
#> 2           7.1         3.0          5.9         2.1 virginica
#> 3           6.5         3.0          5.8         2.2 virginica
#> 4           7.6         3.0          6.6         2.1 virginica
#> 5           7.2         3.6          6.1         2.5 virginica
#> 6           6.8         3.0          5.5         2.1 virginica
#> 7           5.8         2.8          5.1         2.4 virginica
#> 8           6.4         3.2          5.3         2.3 virginica
#> 9           7.7         3.8          6.7         2.2 virginica
#> 10          7.7         2.6          6.9         2.3 virginica
#> 11          6.9         3.2          5.7         2.3 virginica
#> 12          6.7         3.3          5.7         2.1 virginica
#> 13          6.4         2.8          5.6         2.1 virginica
#> 14          6.4         2.8          5.6         2.2 virginica
#> 15          7.7         3.0          6.1         2.3 virginica
#> 16          6.3         3.4          5.6         2.4 virginica
#> 17          6.9         3.1          5.4         2.1 virginica
#> 18          6.7         3.1          5.6         2.4 virginica
#> 19          6.9         3.1          5.1         2.3 virginica
#> 20          6.8         3.2          5.9         2.3 virginica
#> 21          6.7         3.3          5.7         2.5 virginica
#> 22          6.7         3.0          5.2         2.3 virginica
#> 23          6.2         3.4          5.4         2.3 virginica