This is a method for the dplyr distinct() generic. It is translated to data.table::unique.data.table().
Usage
# S3 method for class 'dtplyr_step'
distinct(.data, ..., .keep_all = FALSE)
Arguments
- .data
A lazy_dt().
- ...
<data-masking> Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables in the data frame.
- .keep_all
If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.
Examples
library(dplyr, warn.conflicts = FALSE)
library(dtplyr) # provides lazy_dt()
df <- lazy_dt(data.frame(
  x = sample(10, 100, replace = TRUE),
  y = sample(10, 100, replace = TRUE)
))
df %>% distinct(x)
#> Source: local data table [10 x 1]
#> Call: unique(`_DT7`[, .(x)])
#>
#> x
#> <int>
#> 1 7
#> 2 5
#> 3 6
#> 4 4
#> 5 9
#> 6 8
#> # ℹ 4 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
df %>% distinct(x, y)
#> Source: local data table [59 x 2]
#> Call: unique(`_DT7`)
#>
#> x y
#> <int> <int>
#> 1 7 4
#> 2 5 2
#> 3 6 9
#> 4 4 2
#> 5 6 3
#> 6 9 3
#> # ℹ 53 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
df %>% distinct(x, .keep_all = TRUE)
#> Source: local data table [10 x 2]
#> Call: unique(`_DT7`, by = "x")
#>
#> x y
#> <int> <int>
#> 1 7 4
#> 2 5 2
#> 3 6 9
#> 4 4 2
#> 5 9 3
#> 6 8 7
#> # ℹ 4 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
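As a rough cross-check of the translation shown in the Call lines above, the sketch below compares a collected distinct() result with a hand-written unique() call on the same table. The table dt and its values are made up for illustration, and the sketch assumes dplyr, dtplyr, and data.table are all installed.

```r
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)
library(data.table)

# Hypothetical, deterministic input so the result does not depend on sample()
dt <- data.table(x = c(1L, 1L, 2L, 2L), y = c(10L, 20L, 30L, 30L))

# Lazy pipeline: nothing is computed until the result is collected
lazy_result <- lazy_dt(dt) %>%
  distinct(x, .keep_all = TRUE) %>%
  as.data.table()

# Hand-written data.table call, mirroring the translated form unique(DT, by = "x")
direct_result <- unique(dt, by = "x")

# Both approaches should keep the first row for each value of x
all.equal(lazy_result, direct_result)
```

Using fixed values instead of sample() keeps the comparison reproducible; the same check could be repeated without .keep_all by comparing against unique(dt[, .(x)]).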