Group input by rows

rowwise() allows you to compute on a data frame one row at a time. This is most useful when a vectorised function doesn't exist.

Most dplyr verbs preserve row-wise grouping. The exception is summarise(), which returns a grouped_df. You can explicitly ungroup with ungroup() or as_tibble(), or convert to a grouped_df with group_by().
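The behaviour described above can be checked with a small sketch. The data frame and the names rw, rw_mut, rw_sum and rw_flat are made up for illustration. The summarise() step assumes at least one variable was preserved in rowwise(), as in the examples further down.

```r
library(dplyr)

# Hypothetical data: `id` is preserved so that summarise() has something to group by
df <- tibble(id = 1:3, x = c(1, 2, 3), y = c(4, 5, 6))
rw <- df %>% rowwise(id)

# mutate() keeps the row-wise grouping
rw_mut <- rw %>% mutate(total = sum(c(x, y)))
inherits(rw_mut, "rowwise_df")

# summarise() returns a grouped_df, grouped by the preserved variable
rw_sum <- rw %>% summarise(total = sum(c(x, y)))
inherits(rw_sum, "grouped_df")

# ungroup() drops the row-wise grouping again
rw_flat <- rw_mut %>% ungroup()
inherits(rw_flat, "rowwise_df")
```
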

rowwise(data, ...)

Arguments

data

Input data frame.

...

<tidy-select> Variables to be preserved when calling summarise(). This is typically a set of variables whose combination uniquely identifies each row.

NB: unlike group_by(), you cannot create new variables here, but you can select multiple variables with (e.g.) everything().
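As a rough illustration of the tidy-select interface, the sketch below preserves two identifier columns with starts_with(). The column names id_a, id_b and n are invented for this example.

```r
library(dplyr)

# Hypothetical data: two identifier columns plus a count
df <- tibble(id_a = c(1, 1, 2), id_b = c("x", "y", "x"), n = c(1, 2, 3))

# Select the columns to preserve with a tidy-select helper
out <- df %>%
  rowwise(starts_with("id")) %>%
  summarise(total = sum(seq_len(n)))

# The preserved columns are carried through to the summary
names(out)
```

Writing something like rowwise(id = id_a + 1) would not work here, because the argument selects existing columns rather than evaluating expressions.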

Value

A row-wise data frame with class rowwise_df. Note that a rowwise_df is implicitly grouped by row, but is not a grouped_df.
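A quick way to see the distinction, sketched on a throwaway tibble (the name rw is illustrative only):

```r
library(dplyr)

rw <- rowwise(tibble(x = 1:3))

# The class vector starts with rowwise_df and does not include grouped_df
class(rw)
is_grouped_df(rw)

# Even so, each row counts as its own group
n_groups(rw)
```
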

List-columns

Because a rowwise data frame has exactly one row per group, it offers a small convenience for working with list-columns. Normally, summarise() and mutate() extract a group's worth of data with [. But when you index a list in this way, you get back another list. When you're working with a rowwise tibble, dplyr will use [[ instead of [ to make your life a little easier.
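The difference is easiest to see with length(). The following sketch uses a made-up list-column called vals. Without rowwise(), vals refers to the whole list-column; with it, vals is the single element for the current row.

```r
library(dplyr)

# Hypothetical data: a list-column whose elements have different lengths
df <- tibble(id = 1:3, vals = list(1:2, 1:5, 1:10))

# Ungrouped: `vals` is the full list, so every row gets the number of rows (3)
plain <- df %>% mutate(n = length(vals))
plain$n

# Row-wise: `vals` is the element itself, so each row gets that element's length
by_row <- df %>% rowwise() %>% mutate(n = length(vals))
by_row$n
```
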

Examples

library(dplyr)

df <- tibble(x = runif(6), y = runif(6), z = runif(6))
# Compute the mean of x, y, z in each row
df %>% rowwise() %>% mutate(m = mean(c(x, y, z)))
#> # A tibble: 6 x 4
#> # Rowwise: 
#>        x      y      z     m
#>    <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.169  0.768  0.569  0.502
#> 2 0.265  0.0762 0.762  0.367
#> 3 0.0638 0.292  0.0537 0.137
#> 4 0.635  0.456  0.225  0.439
#> 5 0.635  0.0706 0.317  0.341
#> 6 0.0981 0.975  0.723  0.599
# Use c_across() to more easily select many variables
df %>% rowwise() %>% mutate(m = mean(c_across(x:z)))
#> # A tibble: 6 x 4
#> # Rowwise: 
#>        x      y      z     m
#>    <dbl>  <dbl>  <dbl> <dbl>
#> 1 0.169  0.768  0.569  0.502
#> 2 0.265  0.0762 0.762  0.367
#> 3 0.0638 0.292  0.0537 0.137
#> 4 0.635  0.456  0.225  0.439
#> 5 0.635  0.0706 0.317  0.341
#> 6 0.0981 0.975  0.723  0.599

# Compute the minimum of x, y, and z in each row
df %>% rowwise() %>% mutate(m = min(c(x, y, z)))
#> # A tibble: 6 x 4
#> # Rowwise: 
#>        x      y      z      m
#>    <dbl>  <dbl>  <dbl>  <dbl>
#> 1 0.169  0.768  0.569  0.169 
#> 2 0.265  0.0762 0.762  0.0762
#> 3 0.0638 0.292  0.0537 0.0537
#> 4 0.635  0.456  0.225  0.225 
#> 5 0.635  0.0706 0.317  0.0706
#> 6 0.0981 0.975  0.723  0.0981
# In this case you can use an existing vectorised function:
df %>% mutate(m = pmin(x, y, z))
#> # A tibble: 6 x 4
#>        x      y      z      m
#>    <dbl>  <dbl>  <dbl>  <dbl>
#> 1 0.169  0.768  0.569  0.169 
#> 2 0.265  0.0762 0.762  0.0762
#> 3 0.0638 0.292  0.0537 0.0537
#> 4 0.635  0.456  0.225  0.225 
#> 5 0.635  0.0706 0.317  0.0706
#> 6 0.0981 0.975  0.723  0.0981
# Where these functions exist they'll be much faster than rowwise
# so be on the lookout for them.

# rowwise() is also useful when doing simulations
params <- tribble(
 ~sim, ~n, ~mean, ~sd,
    1,  1,     1,   1,
    2,  2,     2,   4,
    3,  3,    -1,   2
)
# Here I supply variables to preserve after the summary
params %>%
  rowwise(sim) %>%
  summarise(z = rnorm(n, mean, sd))
#> `summarise()` has grouped output by 'sim'. You can override using the `.groups` argument.
#> # A tibble: 6 x 2
#> # Groups:   sim [3]
#>     sim      z
#>   <dbl>  <dbl>
#> 1     1  1.11 
#> 2     2  1.55 
#> 3     2 -3.34 
#> 4     3 -0.649
#> 5     3 -0.513
#> 6     3 -0.306

# If you want one row per simulation, put the results in a list()
params %>%
  rowwise(sim) %>%
  summarise(z = list(rnorm(n, mean, sd)))
#> `summarise()` has grouped output by 'sim'. You can override using the `.groups` argument.
#> # A tibble: 3 x 2
#> # Groups:   sim [3]
#>     sim z        
#>   <dbl> <list>   
#> 1     1 <dbl [1]>
#> 2     2 <dbl [2]>
#> 3     3 <dbl [3]>
