This is a method for the dplyr distinct() generic. It is translated to data.table::unique.data.table().
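As a rough illustration of this translation, the sketch below compares the result of the dplyr syntax with a hand-written data.table call. It assumes dtplyr, dplyr, and data.table are installed; the table `dt` and its values are made up for illustration.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
library(data.table)

# Hypothetical example data: x has one duplicated value
dt <- data.table(x = c(1, 1, 2), y = c("a", "b", "c"))

# dplyr syntax, evaluated lazily and collected as a data.table
via_dtplyr <- lazy_dt(dt) %>%
  distinct(x) %>%
  as.data.table()

# Equivalent hand-written data.table code
via_dt <- unique(dt[, .(x)])

# Both approaches should return the same distinct values of x
identical(via_dtplyr$x, via_dt$x)
```

Calling show_query() on the lazy step instead of collecting it prints the generated data.table call, which is a convenient way to inspect the translation.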
Usage
# S3 method for dtplyr_step
distinct(.data, ..., .keep_all = FALSE)
Arguments
- .data
- ...
<data-masking> Optional variables to use when determining uniqueness. If there are multiple rows for a given combination of inputs, only the first row will be preserved. If omitted, will use all variables in the data frame.
- .keep_all
If TRUE, keep all variables in .data. If a combination of ... is not distinct, this keeps the first row of values.
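The effect of these arguments can be sketched on a small table where the first-row rule is easy to follow. This is a minimal example, assuming dtplyr and dplyr are installed; the data frame `small` and its values are hypothetical.

```r
library(dtplyr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical data: x = 1 appears twice, with different y values
small <- lazy_dt(data.frame(
  x = c(1, 1, 2),
  y = c("a", "b", "c"),
  stringsAsFactors = FALSE
))

# Only the variables named in ... are returned by default
only_x <- small %>% distinct(x) %>% as.data.frame()

# With .keep_all = TRUE, the first row for each value of x is kept
kept <- small %>% distinct(x, .keep_all = TRUE) %>% as.data.frame()

# With ... omitted, all variables are used to determine uniqueness
all_vars <- small %>% distinct() %>% as.data.frame()

names(only_x)   # "x"
kept$y          # "a" "c"
nrow(all_vars)  # 3
```

Here the row with y = "b" is dropped under .keep_all = TRUE because it is the second row with x = 1, while distinct() with no variables keeps all three rows since no full row is repeated.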
Examples
library(dtplyr)  # provides lazy_dt()
library(dplyr, warn.conflicts = FALSE)
df <- lazy_dt(data.frame(
x = sample(10, 100, replace = TRUE),
y = sample(10, 100, replace = TRUE)
))
df %>% distinct(x)
#> Source: local data table [10 x 1]
#> Call: unique(`_DT7`[, .(x)])
#>
#> x
#> <int>
#> 1 10
#> 2 4
#> 3 1
#> 4 5
#> 5 7
#> 6 2
#> # … with 4 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
df %>% distinct(x, y)
#> Source: local data table [64 x 2]
#> Call: unique(`_DT7`)
#>
#> x y
#> <int> <int>
#> 1 10 8
#> 2 4 3
#> 3 1 8
#> 4 1 6
#> 5 5 7
#> 6 7 10
#> # … with 58 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results
df %>% distinct(x, .keep_all = TRUE)
#> Source: local data table [10 x 2]
#> Call: unique(`_DT7`, by = "x")
#>
#> x y
#> <int> <int>
#> 1 10 8
#> 2 4 3
#> 3 1 8
#> 4 5 7
#> 5 7 10
#> 6 2 7
#> # … with 4 more rows
#>
#> # Use as.data.table()/as.data.frame()/as_tibble() to access results