Skip to content

This is a method for the dplyr mutate() generic. It is translated to the j argument of [.data.table, using := to modify "in place". If .before or .after is provided, the new columns are relocated with a call to data.table::setcolorder().

Usage

# S3 method for dtplyr_step
mutate(
  .data,
  ...,
  .by = NULL,
  .keep = c("all", "used", "unused", "none"),
  .before = NULL,
  .after = NULL
)

Arguments

.data

A lazy_dt().

...

<data-masking> Name-value pairs. The name gives the name of the column in the output.

The value can be:

  • A vector of length 1, which will be recycled to the correct length.

  • A vector the same length as the current group (or the whole data frame if ungrouped).

  • NULL, to remove the column.

  • A data frame or tibble, to create multiple columns in the output.

.by

[Experimental]

<tidy-select> Optionally, a selection of columns to group by for just this operation, functioning as an alternative to group_by(). For details and examples, see ?dplyr_by.

.keep

Control which columns from .data are retained in the output. Grouping columns and columns created by ... are always kept.

  • "all" retains all columns from .data. This is the default.

  • "used" retains only the columns used in ... to create new columns. This is useful for checking your work, as it displays inputs and outputs side-by-side.

  • "unused" retains only the columns not used in ... to create new columns. This is useful if you generate new columns, but no longer need the columns used to generate them.

  • "none" doesn't retain any extra columns from .data. Only the grouping variables and columns created by ... are kept.

Note: With dtplyr .keep will only work with column names passed as symbols, and won't work with other workflows (e.g. eval(parse(text = "x + 1")))

.before, .after

<tidy-select> Optionally, control where new columns should appear (the default is to add to the right hand side). See relocate() for more details.

Examples

library(dplyr, warn.conflicts = FALSE)

dt <- lazy_dt(data.frame(x = 1:5, y = 5:1))
dt %>%
  mutate(a = (x + y) / 2, b = sqrt(x^2 + y^2))
#> Source: local data table [5 x 4]
#> Call:   copy(`_DT24`)[, `:=`(a = (x + y)/2, b = sqrt(x^2 + y^2))]
#> 
#>       x     y     a     b
#>   <int> <int> <dbl> <dbl>
#> 1     1     5     3  5.10
#> 2     2     4     3  4.47
#> 3     3     3     3  4.24
#> 4     4     2     3  4.47
#> 5     5     1     3  5.10
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results

# It uses a more sophisticated translation when newly created variables
# are used in the same expression
dt %>%
  mutate(x1 = x + 1, x2 = x1 + 1)
#> Source: local data table [5 x 4]
#> Call:   copy(`_DT24`)[, `:=`(c("x1", "x2"), {
#>     x1 <- x + 1
#>     x2 <- x1 + 1
#>     .(x1, x2)
#> })]
#> 
#>       x     y    x1    x2
#>   <int> <int> <dbl> <dbl>
#> 1     1     5     2     3
#> 2     2     4     3     4
#> 3     3     3     4     5
#> 4     4     2     5     6
#> 5     5     1     6     7
#> 
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results