pkgsrc.se | The NetBSD package collection

Subject: CVS commit: pkgsrc/databases/R-dbplyr
From: Makoto Fujiwara
Date: 2024-12-14 14:38:41
Message id: 20241214133841.72D1DFC1C@cvs.NetBSD.org
Log Message:
(databases/R-dbplyr) Updated 2.3.2 to 2.5.0

# dbplyr 2.5.0

## Improved tools for qualified table names

* Specification of table names with schema/catalogs has been overhauled to
  make it simpler. This includes the following features and fixes:

  * The simplest way to refer to a qualified table is now to wrap it in
    `I()`, e.g. `I("schema_name.table_name")`.

  * Use of `sql()` and `ident_q()` inside `in_catalog()` and `in_schema()`
    is once again supported (#1388).

  * It's ok to use `ident_q()` once again (#1413) and you should no longer
    see unsuppressable warnings about using `in_schema()` (#1408).

  * The names of the arguments to `Id()` no longer matter, only their order
    (#1416). Additionally, thanks to changes to the DBI package, you no
    longer need to name each argument.

  * If you accidentally pass a named vector to any of the database identifer
    functions, those names will be automatically stripped (#1404).

  * `tbl_sql(check_from)` is now deprecated.

* dbplyr now exports some tools to work with the internal `table_path` class
  which is useful for certain backends that need to work with this
  data structure (#1300).

## Improved SQL

* New translations for clock functions `add_years()`, `add_days()`,
  `date_build()`, `get_year()`, `get_month()`, `get_day()`,
  and `base::difftime()` on SQL server, Redshift, Snowflake, and Postgres.

* `select()` will keep computed columns used to `arrange()` subqueries that are
  eliminated by a subsequent select (@ejneer, #1437).

* `semi_join()` will no longer inline away an aggregate filter (i.e. `HAVING`
  clause) that was followed by a `select()` (@ejneer, #1474)

* Improved function translations:

  * Functions qualified with the base namespace are now also translated, e.g.
    `base::paste0(x, "_1")` is now translated (@mgirlich, #1022).

  * `-1 + x` now generates a translation instead erroring (#1420).

  * `x$name` never attempts to evaluate `name` (#1368).

  * You can once again use `NULL` on the LHS of an infix operator in order
    to generate SQL with unusual syntax (#1345).

  * Namespaced calls now error if the function doesn't exist, or a translation
    is not available (#1426).

  * `lead()` translation coerces `n` to an integer.

* Databricks: now supports creating non-temporary tables too (#1418).

* Oracle:

  * `db_explain()` now works (@thomashulst, #1353).

  * `as.Date()` works when applied to a string (#1389).

  * `head()` is once again translated to `FETCH FIRST`. This does require Oracle
    12c or newer, but it actually works, compared to the approach using
    `ROWNUM` from #1292 (#1436).

  * Added support for `str_replace()` and `str_replace_all()` via
    `REGEXP_REPLACE()` (@thomashulst, #1402).

* Snowflake (@nathanhaigh, #1406)

  * Added support for `str_starts()` and `str_ends()` via `REGEXP_INSTR()`

  * Refactored `str_detect()` to use `REGEXP_INSTR()` so now supports
    regular expressions.

  * Refactored `grepl()` to use `REGEXP_INSTR()` so now supports
    case-insensitive matching through `grepl(..., ignore.case = TRUE)`

* SQL server:

  * Now products a clear error if you attempt to use `n_distinct()` in
    `mutate()` or `filter()` (#1366).

  * `filter()` does a better job of converting logical vectors
    from bit to boolean (@ejneer, #1288).

* MySQL: `as.integer()` gets correct translation (@krlmlr, #1375).

## Minor improvements and bug fixes

* Deprecation status of functions deprecated in previous versions (at least
  2 years old) have been advanced. In particular, `src_sql()` is now defunct,
  as is the use of `partial_eval()` with character `data`.

* Database errors now show the generated SQL, which hopefully will make it
  faster to track down problems (#1401).

* When dbplyr creates an index on a table in a schema (e.g. `schema.table`),
  it now only includes the table name in the index name, not the schema name.

* The class of remote sources now includes all S4 class names, not just
  the first (#918).

* `compute()` passes additional arguments all the way to
  `sql_query_save()`-methods (@rsund).

* `db_sql_render()` correctly passes on `...` when re-calling with
  `sql_options` set (#1394).

* `reframe()` now gives an informative error that it isn't supported (#1148).

* `rows_patch(in_place = FALSE)` now works when more than one column should be
  patched (@gorcha, #1443).

* New `simulate_mariadb()` (@krlmlr, #1375).

* `sql_translator()` now checks for duplicated definitions (@krlmlr, #1374).

# dbplyr 2.4.0

## Breaking changes

* Using `compute(temporary = FALSE)` without providing a name is now
  deprecated (@mgirlich, #1154).

* `ntile()`'s first argument has been renamed from `order_by` to `x` to
  match the interface of `dplyr::ntile()` (@mgirlich, #1242).

* `simulate_vars()` and `simulate_vars_is_typed()` were removed as they weren't
  used and tidyselect now offers `tidyselect_data_proxy()` and
  `tidyselect_data_has_predicates()` (@mgirllich, #1199).

* `sql_not_supported()` now expects a function name without parentheses.

* `sql_query_append()`, `sql_query_insert()`, `sql_query_update()`,
  `sql_query_upsert()`, and `sql_query_delete()` changed their arguments to
  make them more consistent to the other `sql_query_*()` functions:

  * `x_name` was renamed to `table`.
  * `y` was renamed to `from` and must now be a table identifier or SQL instead
    of a lazy table.
  * `sql_query_append()` and `sql_query_insert()` have gained the argument `cols`.

* `remote_name()` now returns a string with the name of the table. To get the
  qualified identifier use the newly added `remote_table()` (@mgirlich, #1280).

* `tbl_lazy()` loses `src` argument after it has been deprecated for years
  (@mgirlich, #1208).

* `translate_sql()` now requires the `con` argument (@mgirlich, #1311).
  The `vars` argument has been removed after it threw an error for the last 7
  years (@mgirlich).

## Improved SQL

* Preliminary databricks Spark SQL backend (#1377).

* Joins

  * `*_join()` after `full_join()` works again (@mgirlich, #1178).

  * `*_join()` now allows specifying the relationship argument. It must be
    `NULL` or `"many-to-many"` (@bairdj, #1305).

  * Queries now qualify `*` with the table alias for better compatibility
    (@mgirlich, #1003).

  * `full_join()` can now handle column names that only differ in case
    (@ejneer, #1255).

  * The `na_matches` argument of `semi_join()` and `anti_join()` works again
    (@mgirlich, #1211).

  * A `semi/anti_join()` on fitlered `y` is inlined when possible (@mgirlich, #884).

  * Joins now work again for Pool and Oracle connections (@mgirlich, #1177, #1181).

* A sequence of `union()` resp. `union_all()` now produces a flat query
  instead of subqueries (@mgirlich, #1269).

* Added translations for:

  * `nzchar()` (@MichaelChirico, @mgirlich, #1094).
  * `str_detect()`, `str_starts()` and `str_ends()` with fixed patterns
     (@mgirlich, #1009).
  * `runif()` (@mgirlich, #1200).

* `if_any()` and `if_all()` translations are now wrapped in parentheses.
  This makes sure it can be combined via `&` with other conditions
  (@mgirlich, #1153).

* `nth()`, `first()`, and `last()` now support the `na_rm` argument
  (@mgirlich, #1193).

## Minor improvements and bug fixes

* `across()` now supports namespaced functions, e.g.
  `across(x, dplyr::dense_rank)` (@mgirlich, #1231).

* `db_copy_to(overwrite = TRUE)` now actually works.

* `db_copy_to()`'s `...` are now passed to `db_write_table()` (@mgirlich, #1237).

* Added `db_supports_table_alias_with_as()` to customise whether a backend
  supports specifying a table alias with `AS` or not (@mgirlich).

* `db_write_table()` and `db_save_query()` gain the `overwrite` argument.

* `dbplyr_pivot_wider_spec()` is now exported. Unlike `pivot_wider()` this can
  be lazy. Note that this will be removed soon after `pivot_wider_spec()`
  becomes a generic (@mgirlich).

* `filter()`ing with window functions now generates columns called `col01`
  rather than `q01` (@mgirlich, #1258).

* `pivot_wider()` now matches tidyr `NA` column handling (@ejneer #1238).

* `select()` can once again be used after `arrange(desc(x))` (@ejneer, #1240).

* `show_query()` and `remote_query()` gain the argument `sql_options` that allows
  to control how the SQL is generated. It can be created via `sql_options()`
  which has the following arguments:

    * `cte`: use common table expressions?
    * `use_star`: use `SELECT *` or explicitly select every column?
    * `qualify_all_columns`: qualify all columns in a join or only the ambiguous \ 
ones?
  (@mgirlich, #1146).

  Consequently the `cte` argument of `show_query()` and `remote_query()` has
  been deprecated (@mgirlich, #1146).

* `slice_min/max()` can now order by multiple variables like dplyr, e.g. use
  `slice_min(lf, tibble(x, y))` (@mgirlich, #1167).

* `slice_*()` now supports the data masking pronouns `.env` and `.data` \ 
(@mgirlich, #1294).

* `sql_join_suffix()` gains the argument `suffix` so that methods can check
  whether the suffix is valid for the backend (@mgirlich).

* `sql_random()` is now deprecated. It was used to power `slice_sample()` which
  is now done via the translation for `runif()` (@mgirlich, #1200).

* `tbl()` now informs when the user probably forgot to wrap the table identifier
  with `in_schema()` or `sql()` (@mgirlich, #1287).

## Backend specific improvements

* Access
  * Added translation for `!=` to `<>` (@erikvona, #1219).

* DuckDB
  * now supports the `returning` argument of `rows_*()`.

* MySQL/MariaDB:
  * `rows_update()` and `rows_patch()` now give an informative error when the
    unsupported `returning` argument is used (@mgirlich, #1279).
  * `rows_upsert()` now gives an informative error that it isn't supported
    (@mgirlich, #1279).
  * `rows_*()` use the column types of `x` when auto copying `y` (@mgirlich, #1327).
  * `copy_inline()` now works (@mgirlich, #1188).
  * Fix translation of `as.numeric()`, `as.POSIXct()`, `as_datetime()`, and
    `as.integer64()` (@avsdev-cw, #1189).

* MS SQL:
  * `row_number()` now works when no order is specified (@ejneer, @fh-mthomson, \ 
#1332)

* Oracle:
  * Fix translation of `rows_upsert()` (@mgirlich, @TBlackmore, #1286)
  * `head(n)` is now translated to `WHERE ROWNUM <= n` to also support old
    versions <= 11.2 (@JeremyPasco, #1292).

* Postgres
  * The `rows_*()` functions now also work inside a transaction (@mgirlich, #1183).

* SQLite
  * Subqueries now also get an alias. This makes it consistent with other
    backends and simplifies the implementation.

* SQL Server
  * `distinct(.keep_all = TRUE)` now works (@mgirlich, #1053).
  * The translation of `between()` now also works when used in `mutate()`
    (@mgirlich, #1241).
  * `any()` and `all()` now work (@ejneer, #1273).
  * Fixed negation of bit (boolean) fields (@ejneer, #1239)

* Snowflake:
  * `na.rm = TRUE` is now respected in `pmin()` and `pmax()` instead of being \ 
silently ignored (@fh-mthomson, #1329)
  * `row_number()` now works when no order is specified (@fh-mthomson, #1332)

* Teradata
  * `distinct()` + `head()` now work (@mgirlich, #685).
  * `as.Date(x)` is now translate to `CAST(x AS DATE)` again unless `x` is a
    string (@mgirlich, #1285).
  * `row_number()` no longer defaults to partitioning by groups (now aligned \ 
with other databases when no order is specified: `ROW_NUMBER()` defaults to \ 
`ORDER BY (SELECT NULL)`) (@fh-mthomson, #1331)

# dbplyr 2.3.4

* Hot patch release to resolve R CMD check failures.

# dbplyr 2.3.3

* Hot patch to fix R CMD check issues
Files:
Revision	Action	file
1.6	modify	pkgsrc/databases/R-dbplyr/Makefile
1.5	modify	pkgsrc/databases/R-dbplyr/distinfo