`count()` lets you quickly count the unique values of one or more variables: `df %>% count(a, b)` is roughly equivalent to `df %>% group_by(a, b) %>% summarise(n = n())`. `count()` is paired with `tally()`, a lower-level helper that is equivalent to `df %>% summarise(n = n())`. Supply `wt` to perform weighted counts, switching the summary from `n = n()` to `n = sum(wt)`.
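The equivalences described above can be checked on a small example. This is a minimal sketch, assuming dplyr 1.0 or later is installed; the data frame `df` and its columns `a`, `b`, and `w` are made up for illustration and do not come from the package.

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
df <- tibble(
  a = c("x", "x", "y", "y", "y"),
  b = c(1, 1, 1, 2, 2),
  w = c(10, 20, 30, 40, 50)
)

# count() versus the explicit group_by() + summarise() spelling
by_count <- df %>% count(a, b)
by_hand  <- df %>%
  group_by(a, b) %>%
  summarise(n = n(), .groups = "drop")

# Weighted count: the summary becomes n = sum(w) instead of n = n()
wt_count <- df %>% count(a, wt = w)
wt_hand  <- df %>%
  group_by(a) %>%
  summarise(n = sum(w), .groups = "drop")

# tally() on an ungrouped data frame counts all rows
total <- df %>% tally()
```

The explicit versions pass `.groups = "drop"` so that they return ungrouped results, which is what `count()` gives for ungrouped input; this is one reason the equivalence is only "rough".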
`add_count()` and `add_tally()` are the equivalents of `count()` and `tally()`, but they use `mutate()` instead of `summarise()`, so they add a new column with group-wise counts and keep every input row.
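To make the `mutate()` analogy concrete, the sketch below compares `add_count()` with a hand-written `group_by()` + `mutate()` pipeline. It assumes dplyr 1.0 or later and reuses the small `name`/`gender`/`runs` data frame from the examples on this page; the variable names `with_n`, `by_hand`, and `with_total` are illustrative only.

```r
library(dplyr)

df <- tibble(
  name   = c("Max", "Sandra", "Susan"),
  gender = c("male", "female", "female"),
  runs   = c(10, 1, 4)
)

# add_count() keeps all rows and appends the per-group count
with_n <- df %>% add_count(gender)

# Roughly the same thing spelled out with mutate()
by_hand <- df %>%
  group_by(gender) %>%
  mutate(n = n()) %>%
  ungroup()

# add_tally() appends the count for the current grouping (here: all rows)
with_total <- df %>% add_tally()
```

Unlike `count()`, neither call reduces the number of rows: each row simply carries the size of the group it belongs to.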
count(x, ..., wt = NULL, sort = FALSE, name = NULL)
tally(x, wt = NULL, sort = FALSE, name = NULL)
add_count(x, ..., wt = NULL, sort = FALSE, name = NULL, .drop = deprecated())
add_tally(x, wt = NULL, sort = FALSE, name = NULL)
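Argument | Description |
---|---|
x | A data frame, data frame extension (e.g. a tibble), or a lazy data frame (e.g. from dbplyr or dtplyr). |
... | <data-masking> Variables to group by. |
wt | <data-masking> Frequency weights. Can be `NULL` or a variable: if `NULL` (the default), counts the number of rows in each group; if a variable, computes `sum(wt)` for each group. |
sort | If `TRUE`, will show the largest groups at the top. |
name | The name of the new column in the output. If omitted, it will default to `n`. |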
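.drop | For `count()`: if `FALSE`, counts for empty groups (levels of factors that don't appear in the data) are included. Deprecated in `add_count()`, as the usage above indicates. |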
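The `sort` and `name` arguments are easiest to see side by side. The following is a small sketch, assuming dplyr 1.0 or later; the data frame `df`, its column `g`, and the column name `n_rows` are hypothetical and chosen only for this illustration.

```r
library(dplyr)

# Hypothetical data: "c" occurs three times, "b" twice, "a" once
df <- tibble(g = c("a", "b", "b", "c", "c", "c"))

# sort = TRUE puts the largest groups first
sorted <- df %>% count(g, sort = TRUE)

# name = ... replaces the default column name for the counts
renamed <- df %>% count(g, name = "n_rows")
```

Without `sort = TRUE` the rows come back in group order; choosing a custom `name` is mainly useful when the input already has a column holding counts.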
An object of the same type as the input `x`. `count()` and `add_count()` group transiently, so the output has the same groups as the input.
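Transient grouping can be observed by inspecting the grouping variables before and after a call. This is a minimal sketch, assuming dplyr 1.0 or later; the data frame `df` and its columns `g` and `h` are invented for the example.

```r
library(dplyr)

# Hypothetical data with two candidate grouping columns
df <- tibble(
  g = c("a", "a", "b"),
  h = c("u", "v", "v")
)

# Input grouped by g; count() additionally groups by h only while counting
grouped <- df %>% group_by(g)
out <- grouped %>% count(h)

groups_in  <- group_vars(grouped)
groups_out <- group_vars(out)

# Ungrouped input gives ungrouped output
plain_out <- df %>% count(g)
```

Here `out` has one row per combination of `g` and `h`, but it is grouped by `g` alone, matching the input.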
# count() is a convenient way to get a sense of the distribution of
# values in a dataset
starwars %>% count(species)
#> # A tibble: 38 x 2
#>    species       n
#>    <chr>     <int>
#>  1 Aleena        1
#>  2 Besalisk      1
#>  3 Cerean        1
#>  4 Chagrian      1
#>  5 Clawdite      1
#>  6 Droid         6
#>  7 Dug           1
#>  8 Ewok          1
#>  9 Geonosian     1
#> 10 Gungan        3
#> # … with 28 more rows

starwars %>% count(species, sort = TRUE)
#> # A tibble: 38 x 2
#>    species      n
#>    <chr>    <int>
#>  1 Human       35
#>  2 Droid        6
#>  3 NA           4
#>  4 Gungan       3
#>  5 Kaminoan     2
#>  6 Mirialan     2
#>  7 Twi'lek      2
#>  8 Wookiee      2
#>  9 Zabrak       2
#> 10 Aleena       1
#> # … with 28 more rows

starwars %>% count(sex, gender, sort = TRUE)
#> # A tibble: 6 x 3
#>   sex            gender        n
#>   <chr>          <chr>     <int>
#> 1 male           masculine    60
#> 2 female         feminine     16
#> 3 none           masculine     5
#> 4 NA             NA            4
#> 5 hermaphroditic masculine     1
#> 6 none           feminine      1

starwars %>% count(birth_decade = round(birth_year, -1))
#> # A tibble: 15 x 2
#>    birth_decade     n
#>           <dbl> <int>
#>  1           10     1
#>  2           20     6
#>  3           30     4
#>  4           40     6
#>  5           50     8
#>  6           60     4
#>  7           70     4
#>  8           80     2
#>  9           90     3
#> 10          100     1
#> 11          110     1
#> 12          200     1
#> 13          600     1
#> 14          900     1
#> 15           NA    44

# use the `wt` argument to perform a weighted count. This is useful
# when the data has already been aggregated once
df <- tribble(
  ~name,    ~gender,  ~runs,
  "Max",    "male",   10,
  "Sandra", "female", 1,
  "Susan",  "female", 4
)

# counts rows:
df %>% count(gender)
#> # A tibble: 2 x 2
#>   gender     n
#>   <chr>  <int>
#> 1 female     2
#> 2 male       1

# counts runs:
df %>% count(gender, wt = runs)
#> # A tibble: 2 x 2
#>   gender     n
#>   <chr>  <dbl>
#> 1 female     5
#> 2 male      10

# tally() is a lower-level function that assumes you've done the grouping
starwars %>% tally()
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1    87

starwars %>% group_by(species) %>% tally()
#> # A tibble: 38 x 2
#>    species       n
#>    <chr>     <int>
#>  1 Aleena        1
#>  2 Besalisk      1
#>  3 Cerean        1
#>  4 Chagrian      1
#>  5 Clawdite      1
#>  6 Droid         6
#>  7 Dug           1
#>  8 Ewok          1
#>  9 Geonosian     1
#> 10 Gungan        3
#> # … with 28 more rows

# both count() and tally() have add_ variants that work like
# mutate() instead of summarise()
df %>% add_count(gender, wt = runs)
#> # A tibble: 3 x 4
#>   name   gender  runs     n
#>   <chr>  <chr>  <dbl> <dbl>
#> 1 Max    male      10    10
#> 2 Sandra female     1     5
#> 3 Susan  female     4     5

df %>% add_tally(wt = runs)
#> # A tibble: 3 x 4
#>   name   gender  runs     n
#>   <chr>  <chr>  <dbl> <dbl>
#> 1 Max    male      10    15
#> 2 Sandra female     1    15
#> 3 Susan  female     4    15