dtplyr 1.3.2
CRAN release: 2025-09-10
- R CMD check fixes
New features
- `reframe()` is now translated.
- `consecutive_id()` is now mapped to `data.table::rleid()`. Note: `rleid()` only accepts vector inputs and cannot be used with data frame inputs.
- `case_match()` is now translated to `fcase()`.
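As a rough illustration of the two translations above, the sketch below builds a small lazy table and inspects the generated data.table code with `show_query()`. The toy data and the variable names (`dt`, `runs`, `labelled`) are made up for this example, and the exact text printed by `show_query()` may differ between versions.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data: `g` contains three consecutive runs (a, a | b | a).
dt <- lazy_dt(data.table(g = c("a", "a", "b", "a"), x = 1:4))

# consecutive_id() on a single vector; per the notes it maps to rleid().
runs <- dt %>% mutate(run = consecutive_id(g))
show_query(runs)

# case_match() with literal matches; per the notes it maps to fcase().
labelled <- dt %>% mutate(label = case_match(g, "a" ~ "alpha", "b" ~ "beta"))
show_query(labelled)

# Nothing is computed until the result is requested.
run_ids <- as_tibble(runs)$run
labels <- as_tibble(labelled)$label
```

Only vector inputs are used here, in line with the note about `rleid()`.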
Minor improvements and bug fixes
- `case_when(.default = )` now works.
- `.by` no longer alters grouping in prior steps (#439).
- Arguments to `$` and `[[` calls are no longer prepended with `..` (#434).
- Grouping now works with non-standard column names (#451).
- `print.dtplyr_step()` gains `n`, `max_extra_cols`, and `max_footer_lines` args (#464).
- `transmute()` preserves row count and avoids unnecessary copies (#470).
dtplyr 1.3.1
CRAN release: 2023-03-22
- Fix for failing R CMD check.
- dtplyr no longer directly depends on `crayon`.
dtplyr 1.3.0
CRAN release: 2023-02-24
New features
- `.by`/`by` has been implemented for `mutate()`, `summarise()`, `filter()`, and the `slice()` family (#399).
- New translations for `add_count()`, `pick()` (#341), and `unite()`.
- `min_rank()`, `dense_rank()`, `percent_rank()`, & `cume_dist()` are now mapped to their `data.table` equivalents (#396).
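A minimal sketch of per-operation grouping with `.by`, assuming dplyr 1.1.0 or later is installed alongside this dtplyr version. The data and object names are invented for illustration.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data with two groups.
dt <- lazy_dt(data.table(g = c("a", "b", "a"), x = c(1, 2, 5)))

# Group only for this summarise() call; no group_by()/ungroup() needed.
per_group <- dt %>% summarise(total = sum(x), .by = g)
show_query(per_group)
totals <- as_tibble(per_group)

# The same argument works in mutate(); rows keep their original order.
with_mean <- dt %>% mutate(mean_x = mean(x), .by = g) %>% as_tibble()
```

Because the grouping applies to a single verb, later steps in the pipeline start from ungrouped data.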
Minor improvements and bug fixes
- dtplyr no longer directly depends on `ellipsis`.
- Chained operations properly prevent modify-by-reference (#210).
- `across()`, `if_any()`, and `if_all()` evaluate the `.cols` argument in the environment from which the function was called.
- `desc()` now supports use of the `.data` pronoun inside `arrange()` (#346).
- `full_join()` now produces output with correctly named columns when a non-default value for `suffix` is supplied. Previously the `suffix` argument was ignored (#382).
- `if_any()` and `if_all()` now work without specifying the `.fns` argument (@mgirlich, #325) and for a list of functions (@mgirlich, #335).
- `pivot_wider()`’s `names_glue` now works even when `names_from` contains `NA`s (#394).
- In `semi_join()` the `y` table is again coerced to a lazy table if `copy = TRUE` (@mgirlich, #322).
- `mutate()` can now use `.keep`.
- `mutate()`/`summarize()` correctly translate anonymous functions (#362).
- `mutate()`/`transmute()` now support `glue::glue()` and `stringr::str_glue()` without specifying `.envir`.
- `where()` now clearly errors because dtplyr doesn’t support selection by predicate (#271).
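To make the `full_join()` fix concrete, here is a small sketch with a non-default `suffix`. The two tables and the suffix strings are hypothetical; the point is only that the shared non-key column `v` picks up the supplied suffixes.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Two hypothetical tables sharing the key `id` and the non-key column `v`.
a <- lazy_dt(data.table(id = 1:2, v = c(10, 20)))
b <- lazy_dt(data.table(id = 2:3, v = c(200, 300)))

# A non-default suffix should now be reflected in the output column names.
joined <- full_join(a, b, by = "id", suffix = c("_left", "_right")) %>%
  as_tibble()

names(joined)
```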
dtplyr 1.2.0
CRAN release: 2021-12-05
New authors
@markfairbanks, @mgirlich, and @eutwt are now dtplyr authors in recognition of their significant and sustained contributions. Along with @eutwt, they supplied the bulk of the improvements in this release!
New features
- dtplyr gains translations for many more tidyr verbs.
- `ifelse()` is mapped to `fifelse()` (@markfairbanks, #220).
Minor improvements and bug fixes
- `slice()` helpers (`slice_head()`, `slice_tail()`, `slice_min()`, `slice_max()` and `slice_sample()`) now accept negative values for `n` and `prop`.
- `across()` defaults to `everything()` when `.cols` isn’t provided (@markfairbanks, #231), and handles named selections (@eutwt #293). It now handles `.fns` arguments in more forms (@eutwt #288):
  - Anonymous functions, such as `function(x) x + 1`
  - Formulas which don’t require a function call, such as `~ 1`
- `arrange(dt, desc(col))` is translated to `dt[order(-col)]` in order to take advantage of data.table’s fast order (@markfairbanks, #227).
- `count()` applied to data.tables no longer breaks when dtplyr is loaded (@mgirlich, #201).
- `case_when()` supports use of `T` to specify the default (#272).
- `filter()` errors for named input, e.g. `filter(dt, x = 1)` (@mgirlich, #267) and works for negated logical columns (@mgirlich, #211).
- `group_by()` ungroups when no grouping variables are specified (@mgirlich, #248), and supports inline mutation like `group_by(dt, y = x)` (@mgirlich, #246).
- `if_else()` named arguments are translated to the correct arguments in `data.table::fifelse()` (@markfairbanks, #234).
- `if_else()` supports `.data` and `.env` pronouns (@markfairbanks, #220).
- `if_any()` and `if_all()` default to `everything()` when `.cols` isn’t provided (@eutwt, #294).
- `intersect()`/`union()`/`union_all()`/`setdiff()` convert data.table inputs to `lazy_dt()` (#278).
- `left_join()` produces the same column order as dplyr (@markfairbanks, #139).
- `left_join()`, `right_join()`, `full_join()`, and `inner_join()` perform a cross join for `by = character()` (@mgirlich, #242).
- `left_join()`, `right_join()`, and `inner_join()` are always translated to the `[.data.table` equivalent. For simple merges the translation gets a bit longer, but thanks to the simpler code base it better handles names in `by` and duplicated variable names produced in the data.table join (@mgirlich, #222).
- `mutate()` and `transmute()` work when called without variables (@mgirlich, #248).
- `mutate()` gains new experimental arguments `.before` and `.after` that allow you to control where the new columns are placed (to match dplyr 1.0.0) (@eutwt #291).
- `mutate()` can modify grouping columns (instead of creating another column with the same name) (@mgirlich, #246).
- `n_distinct()` is translated to `uniqueN()`.
- `tally()` and `count()` follow the dplyr convention of creating a unique name if the default output `name` (n) already exists (@eutwt, #295).
- `pivot_wider()` names the columns correctly when `names_from` is a numeric column (@mgirlich, #214).
- `slice_*()` functions after `group_by()` are faster (@mgirlich, #216).
- `slice_max()` works when ordering by a character column (@mgirlich, #218).
- `summarise()` supports the `.groups` argument (@mgirlich, #245).
- `summarise()`, `tally()`, and `count()` can change the value of a grouping variable (@eutwt, #295).
- `transmute()` doesn’t produce duplicate columns when assigning to the same variable (@mgirlich, #249). It correctly flags grouping variables so they are selected (@mgirlich, #246).
- `ungroup()` removes variables in `...` from grouping (@mgirlich, #253).
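Two of the translations listed above can be inspected with `show_query()`, as in the following sketch. The data is hypothetical, and the generated code printed by a given dtplyr version may not match the forms quoted in these notes character for character.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data.
dt <- lazy_dt(data.table(g = c("a", "a", "b"), x = c(2, 1, 3)))

# Descending sort; the notes describe a translation to dt[order(-col)].
sorted <- dt %>% arrange(desc(x))
show_query(sorted)

# n_distinct() per group; the notes describe a translation to uniqueN().
distinct_counts <- dt %>% group_by(g) %>% summarise(k = n_distinct(x))
show_query(distinct_counts)

sorted_x <- as_tibble(sorted)$x
counts <- as_tibble(distinct_counts)
```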
dtplyr 1.1.0
CRAN release: 2021-02-20
New features
- All verbs now have (very basic) documentation pointing back to the dplyr generic, and providing a (very rough) description of the translation accompanied by a few examples.
- Passing a data.table to a dplyr generic now converts it to a `lazy_dt()`, making it a little easier to move between data.table and dplyr syntax.
- dtplyr has been brought up to compatibility with dplyr 1.0.0. This includes new translations for:
  - `slice_min()`, `slice_max()`, `slice_head()`, `slice_tail()`, and `slice_sample()` (#174).

  And `rename()` and `select()` now support dplyr 1.0.0 tidyselect syntax (apart from predicate functions which can’t easily work on lazily evaluated data tables).
- We have begun the process of adding translations for tidyr verbs beginning with `pivot_wider()` (@markfairbanks, #189).
Translation improvements
- `compute()` now creates an intermediate assignment within the translation. This will generally have little impact on performance but it allows you to use intermediate variables to simplify complex translations.
- `case_when()` is now translated to `fcase()` (#190).
- `cur_data()` (`.SD`), `cur_group()` (`.BY`), `cur_group_id()` (`.GRP`), and `cur_group_rows()` (`.I`) are now translated to their data.table equivalents (#166).
- `filter()` on grouped data now uses a much faster translation using `.I` rather than `.SD` (and requiring an intermediate assignment) (#176). Thanks to a suggestion from @myoung3 and @ColeMiller1.
- Translation of individual expressions:
  - `x[[1]]` is now translated correctly.
  - Anonymous functions are now preserved (@smingerson, #155).
  - Environment variables used in the `i` argument of `[.data.table` are now correctly inlined when not in the global environment (#164).
  - `T` and `F` are correctly translated to `TRUE` and `FALSE` (#140).
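The `case_when()` and grouped `filter()` translations can be explored with a sketch like the one below. The data, cut-offs, and labels are assumptions chosen for illustration; `show_query()` prints whatever data.table code the installed version generates.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data.
dt <- lazy_dt(data.table(g = c("a", "a", "b"), x = c(1, 5, 10)))

# case_when() with a catch-all branch; per the notes this becomes fcase().
sized <- dt %>%
  mutate(size = case_when(x < 3 ~ "small", x < 8 ~ "medium", TRUE ~ "large"))
show_query(sized)

# Grouped filter; per the notes the translation is built on .I.
above_mean <- dt %>% group_by(g) %>% filter(x > mean(x))
show_query(above_mean)

sizes <- as_tibble(sized)$size
kept <- as_tibble(above_mean)$x
```

In this data only the row with `x = 5` exceeds its group mean, so the grouped filter keeps a single row.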
Minor improvements and bug fixes
- Grouped filter, mutate, and slice no longer affect ordering of output (#178).
- `as_tibble()` gains a `.name_repair` argument (@markfairbanks).
- `as.data.table()` always calls `[]` so that the result will print (#146).
- `print.lazy_dt()` shows total rows, and grouping, if present.
- `group_map()` and `group_walk()` are now translated (#108).
dtplyr 1.0.1
CRAN release: 2020-01-23
- Better handling for `.data` and `.env` pronouns (#138).
- dplyr verbs now work with `NULL` inputs (#129).
- Joins do a better job at determining output variables in the presence of duplicated outputs (#128). When joining based on different variables in `x` and `y`, joins consistently preserve the column from `x`, not `y` (#137).
- `lazy_dt()` objects now have a useful `glimpse()` method (#132).
- `group_by()` now has an `arrange` parameter which, if set to `FALSE`, sets the data.table translation to use `by` rather than `keyby` (#85).
- `rename()` now works without `data.table` attached, as intended (@michaelchirico, #123).
- dtplyr has been re-licensed as MIT (#165).
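The effect of the `arrange` parameter of `group_by()` is easiest to see by comparing the two generated queries. This is a sketch on made-up data; in data.table, `keyby` sorts the result by the grouping columns while `by` keeps groups in order of first appearance.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data in which group "b" appears before group "a".
dt <- lazy_dt(data.table(g = c("b", "a", "b"), x = 1:3))

# Default (arrange = TRUE): the summary is sorted by the grouping column.
sorted <- dt %>% group_by(g) %>% summarise(n = n())
show_query(sorted)

# arrange = FALSE: groups stay in order of first appearance.
unsorted <- dt %>% group_by(g, arrange = FALSE) %>% summarise(n = n())
show_query(unsorted)

sorted_groups <- as_tibble(sorted)$g
unsorted_groups <- as_tibble(unsorted)$g
```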
dtplyr 1.0.0
CRAN release: 2019-11-12
- Converted from eager approach to lazy approach. You now must use `lazy_dt()` to begin a translation pipeline, and must use `collect()`, `as.data.table()`, `as.data.frame()`, or `as_tibble()` to finish the translation and actually perform the computation (#38).

  This represents a complete overhaul of the package replacing the eager evaluation used in the previous releases. This unfortunately breaks all existing code that used dtplyr, but frankly the previous version was extremely inefficient so offered little of data.table’s impressive speed, and was used by very few people.
- dtplyr provides methods for data.tables that warn you that they use the data frame implementation and you should use `lazy_dt()` (#77).
- Joins now pass `...` on to data.table’s merge method (#41).
- `ungroup()` now copies its input (@christophsax, #54).
- `mutate()` preserves grouping (@christophsax, #17).
- `if_else()` and `coalesce()` are mapped to data.table’s `fifelse()` and `fcoalesce()` respectively (@michaelchirico, #112).
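The lazy workflow described above can be sketched as follows: start with `lazy_dt()`, chain dplyr verbs, optionally inspect the translation, and finish with one of the collecting functions. The data and object names are hypothetical.

```r
library(data.table)
library(dplyr, warn.conflicts = FALSE)
library(dtplyr)

# Hypothetical toy data wrapped in a lazy table; nothing is computed yet.
dt <- lazy_dt(data.table(g = c("a", "a", "b", "b", "b"), x = 1:5))

# Build the pipeline; each verb only records a translation step.
pipeline <- dt %>%
  filter(x > 1) %>%
  group_by(g) %>%
  summarise(total = sum(x))

# Inspect the data.table code that would be run.
show_query(pipeline)

# as_tibble() (or collect(), as.data.table(), as.data.frame()) runs it.
result <- as_tibble(pipeline)
```

Keeping the pipeline lazy lets dtplyr combine the steps into a single data.table expression before any data is touched.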
dtplyr 0.0.3
CRAN release: 2019-02-25
- Maintenance release for CRAN checks.
- `inner_join()`, `left_join()`, `right_join()`, and `full_join()`: new `suffix` argument which allows you to control what suffix duplicated variable names receive, as introduced in dplyr 0.5 (#40, @christophsax).
- Joins use extended `merge.data.table()` and the `on` argument, introduced in data.table 1.9.6. Avoids copy and allows joins by different keys (#20, #21, @christophsax).
dtplyr 0.0.2
CRAN release: 2017-04-21
- This is a compatibility release. It makes dtplyr compatible with dplyr 0.6.0 in addition to dplyr 0.5.0.
