This is a method for the dplyr summarise() generic. It is translated to the j argument of [.data.table.
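One way to see this translation is to print the generated data.table call with show_query(). The following is a minimal sketch rather than part of the official examples; the column name mean_mpg and the use of mtcars are illustrative choices.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

dt <- lazy_dt(mtcars)

# Build the lazy query; nothing is computed yet.
query <- dt %>% summarise(mean_mpg = mean(mpg))

# Print the data.table call that the query translates to.
# The summary expression should appear in the j position.
show_query(query)

# Collect the result to trigger the computation.
as.data.frame(query)
```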
Usage
# S3 method for dtplyr_step
summarise(.data, ..., .by = NULL, .groups = NULL)
Arguments
- .data
A lazy_dt().
- ...
<data-masking> Name-value pairs of summary functions. The name will be the name of the variable in the result. The value can be:
  - A vector of length 1, e.g. min(x), n(), or sum(is.na(y)).
  - A data frame, to add multiple columns from a single expression.
Returning values with size 0 or >1 was deprecated as of 1.1.0. Please use reframe() for this instead.
- .by
<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.
- .groups
Grouping structure of the result.
  - "drop_last": dropping the last level of grouping. This was the only supported option before version 1.0.0.
  - "drop": All levels of grouping are dropped.
  - "keep": Same grouping structure as .data.
  - "rowwise": Each row is its own group.
When .groups is not specified, it is chosen based on the number of rows of the results:
  - If all the results have 1 row, you get "drop_last".
  - If the number of rows varies, you get "keep" (note that returning a variable number of rows was deprecated in favor of reframe(), which also unconditionally drops all levels of grouping).
In addition, a message informs you of that choice, unless the result is ungrouped, the option "dplyr.summarise.inform" is set to FALSE, or when summarise() is called from a function in a package.
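The .by and .groups arguments described above can be compared side by side. This is a hedged sketch, assuming a dtplyr version whose summarise() method accepts .by (as in the Usage signature); the object names by_cyl, dropped, and kept are hypothetical and chosen only for illustration.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

dt <- lazy_dt(mtcars)

# Group for this operation only; no group_by() is needed.
by_cyl <- dt %>%
  summarise(mean_mpg = mean(mpg), .by = cyl)

# Two grouping variables, with all grouping dropped from the result.
dropped <- dt %>%
  group_by(cyl, am) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop")

# Two grouping variables, with only the last level (am) dropped.
kept <- dt %>%
  group_by(cyl, am) %>%
  summarise(mean_mpg = mean(mpg), .groups = "drop_last")

# Inspect the grouping that remains on each lazy result.
group_vars(by_cyl)
group_vars(dropped)
group_vars(kept)
```

Setting .groups explicitly also avoids the informational message about the chosen grouping.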
Examples
library(dtplyr) # provides lazy_dt()
library(dplyr, warn.conflicts = FALSE)
dt <- lazy_dt(mtcars)
dt %>%
  group_by(cyl) %>%
  summarise(vs = mean(vs))
#> Source: local data table [3 x 2]
#> Call:   `_DT38`[, .(vs = mean(vs)), keyby = .(cyl)]
#>
#>     cyl    vs
#>   <dbl> <dbl>
#> 1     4 0.909
#> 2     6 0.571
#> 3     8 0
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
dt %>%
  group_by(cyl) %>%
  summarise(across(disp:wt, mean))
#> Source: local data table [3 x 5]
#> Call:   `_DT38`[, .(disp = mean(disp), hp = mean(hp), drat = mean(drat),
#>     wt = mean(wt)), keyby = .(cyl)]
#>
#>     cyl  disp    hp  drat    wt
#>   <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1     4  105.  82.6  4.07  2.29
#> 2     6  183. 122.   3.59  3.12
#> 3     8  353. 209.   3.23  4.00
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results