Log message:
(textproc/R-stringi) Updated 1.4.6 to 1.7.4, make test passes
## 1.7.4 (2021-08-12)
* [BUGFIX] #449: Fixed segfaults generated by `stri_sprintf`.
* [BUILD TIME] No longer defining `USE_RINTERNALS` and `R_NO_REMAP`.
## 1.7.3 (2021-07-15)
* [BUGFIX] Fixed the previous patch of ICU55 causing a build failure on,
amongst others, CRAN's Solaris-based target.
## 1.7.2 (2021-07-14)
* [BUGFIX] Workaround for a bug in `tools::checkFF` failing
when `NA_character_` is passed to `.Call`.
## 1.7.1 (2021-07-14)
* [BACKWARD INCOMPATIBILITY] `%s$%` and `%stri$%` now use the new `stri_sprintf`
(see below) function instead of `base::sprintf`.
* [BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub<-` and `stri_sub_all<-`,
providing a negative `length` from now on does not result in the corresponding
input string being altered.
* [BACKWARD INCOMPATIBILITY, NEW FEATURE] In `stri_sub` and `stri_sub_all`,
negative `length` results in the corresponding output being `NA`
or not extracted at all, depending on the setting of the new argument
`ignore_negative_length`.
* [BACKWARD INCOMPATIBILITY, BUGFIX, NEW FEATURE] In `stri_subset*`
and their replacement versions, `pattern` and `value` cannot be longer
than `str` (but now they are recycled if necessary).
* [BACKWARD INCOMPATIBILITY, NEW FEATURE] `stri_sub*` now accept the
`from` argument being a matrix like `cbind(from, length=length)`.
Unnamed columns or any other names are still interpreted as `cbind(from, to)`.
Also, the new argument `use_matrix` can be used to disable
the special treatment of such matrices.
* [DOCUMENTATION] It has been clarified that the syntax of `*_charclass`
(e.g., used in `stri_trim*`) differs slightly from regex character
classes.
* [NEW FEATURE] #420: `stri_sprintf` (alias: `stri_string_format`)
is a Unicode-aware replacement for and enhancement of the base `sprintf`:
it adds a customised handling of `NA`s (on demand), computing field size
based on code point width, outputting substrings of at most given width,
variable width and precision (both at the same time), etc. Moreover,
`stri_printf` can be used to display formatted strings conveniently.
* [NEW FEATURE] #153: `stri_match_*_regex` now extract capture group names.
* [NEW FEATURE] #25: `stri_locate_*_regex` now have a new argument,
`capture_groups`, which allows for extracting positions of matches
to parenthesised subexpressions.
* [NEW FEATURE] `stri_locate_*` now have a new argument, `get_length`,
whose setting may result in generating *from-length* matrices
(instead of *from-to* ones).
* [NEW FEATURE] #438: `stri_trans_general` now supports rule-based
as well as reverse-direction transliteration.
* [NEW FEATURE] #434: `stri_datetime_format` and `stri_datetime_parse`
are now vectorised also with respect to the `format` argument.
* [NEW FEATURE] `stri_datetime_fstr` has a new argument, `ignore_special`,
which defaults to `TRUE` for backward compatibility.
* [NEW FEATURE] `stri_datetime_format`, `stri_datetime_add`, and
`stri_datetime_fields` now call `as.POSIXct` more eagerly.
* [NEW FEATURE] `stri_trim*` now have a new argument, `negate`.
* [NEW FEATURE] `stri_replace_rstr` converts `gsub`-style replacement strings
to `stri_replace`-style.
* [INTERNAL] `stri_prepare_arg*` have been refactored; buffer overruns
in the exception handling subsystem are now avoided.
* [BUGFIX] A few functions (`stri_length`, `stri_enc_toutf32`, etc.)
did not throw an exception on an invalid UTF-8
byte sequence (and merely issued a warning instead).
* [BUGFIX] `stri_datetime_fstr` did not honour `NA_character_`
and did not parse format strings such as `"%Y%m%d"` correctly.
It has now been completely rewritten (in C).
* [BUGFIX] `stri_wrap` did not recognise the width of certain Unicode sequences
correctly.
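
As a rough illustration of some of the 1.7.1 items above, the following R sketch formats strings with `stri_sprintf` and passes a *from-length* matrix produced by `stri_locate_first_regex` straight to `stri_sub`. The sample vector `x` is made up for this example, and the behaviour described in the comments follows the changelog entries rather than a verified run.

```r
library(stringi)

# Hypothetical sample data, not taken from the package.
x <- c("spam 123", "no digits here")

# stri_sprintf is vectorised; the handling of NAs can be customised
# on demand via the na_string argument.
stri_sprintf("%-8s|%5d", c("abc", NA), c(1L, 20L), na_string = "<NA>")

# get_length = TRUE asks for a from-length matrix instead of a from-to one.
m <- stri_locate_first_regex(x, "[0-9]+", get_length = TRUE)

# stri_sub accepts such a matrix directly; per the entries above, a negative
# length (as reported for a non-match) should give NA in the output.
stri_sub(x, m)
```

If this works as the entries describe, `stri_sub(x, m)` should agree with `stri_extract_first_regex(x, "[0-9]+")`.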
## 1.6.2 (2021-05-14)
* [BACKWARD INCOMPATIBILITY] In `stri_enc_list()`,
`simplify` now defaults to `TRUE`.
* [NEW FEATURE] #425: The outputs of `stri_enc_list()`, `stri_locale_list()`,
`stri_timezone_list()`, and `stri_trans_list()` are now sorted.
* [NEW FEATURE] #428: In `stri_flatten`, `na_empty=NA` now omits missing values.
* [BUILD TIME] #431: Pre-4.9.0 GCC has `::max_align_t`
but not `std::max_align_t`; a (possible) workaround has been added,
see the `INSTALL` file.
* [BUGFIX] #429: `stri_width()` misclassified the width of certain
code points (including grave accent, Eszett, etc.);
General category *Sk* (Symbol, modifier) is no longer of width 0,
`UCHAR_EAST_ASIAN_WIDTH` of `U_EA_AMBIGUOUS` is no longer of width 2.
* [BUGFIX] #354: `ALTREP` `CHARSXP`s were not copied, and thus could have been
garbage collected in the so-called meanwhile (with thanks to @jimhester).
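
The `na_empty` change in `stri_flatten` (#428) can be sketched as follows. The input vector is hypothetical, and the comments describe the intended behaviour as stated in the entry above.

```r
library(stringi)

# Hypothetical input with a missing value.
x <- c("a", NA, "b")

stri_flatten(x, collapse = ",")                   # the NA propagates to the result
stri_flatten(x, collapse = ",", na_empty = TRUE)  # NA is treated as an empty string
stri_flatten(x, collapse = ",", na_empty = NA)    # NA is omitted (new in 1.6.2)
```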
## 1.6.1 (2021-05-05)
* [GENERAL] #401: stringi is now bundled with ICU4C 69.1 (upgraded from 61.1),
which is used on most Windows and OS X builds as well as on *nix systems
not equipped with system ICU. However, if the C++11 support is disabled,
stringi will be built against the battle-tested ICU4C 55.1.
The update to ICU brings Unicode 13.0 and CLDR 39 support.
* [DOCUMENTATION] A draft version of a paper on `stringi` is now available at
https://stringi.gagolewski.com/_static/vignette/stringi.pdf
* [GENERAL] stringi now requires R >= 3.1 (`CXX_STD` of `CXX11` or `CXX1X`).
* [NEW FEATURE] #408: `stri_trans_casefold()` performs case folding;
this is different from case mapping, which is locale-dependent.
Folding makes two pieces of text that differ only in case identical.
This can come in handy when comparing strings.
* [NEW FEATURE] #421: `stri_rank()` ranks strings in a character vector
(e.g., for ordering data frames with regard to multiple criteria,
the ranks can be passed to `order()`, see #219).
* [NEW FEATURE] #266: `stri_width()` now supports emojis.
* [NEW FEATURE] `%s$%` and `%stri$%` are now vectorised with respect to
both arguments.
* [BUGFIX] `stri_sort_key()` now outputs `bytes`-encoded strings.
* [BUGFIX] #415: `locale=''` was not equivalent to `locale=NULL`
in `stri_opts_collator()`.
* [INTERNAL] #414: Use `LEVELS(x)` macro instead of accessing `(x)->sxpinfo.gp`
directly (@lukaszdaniel).
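
A minimal sketch of how `stri_trans_casefold()` and `stri_rank()` from this release might be used. The strings and the data frame `df` are invented for illustration only.

```r
library(stringi)

# Case folding for a caseless comparison (hypothetical sample strings).
a <- "Stringi"
b <- "STRINGI"
stri_trans_casefold(a) == stri_trans_casefold(b)

# Ranks of strings can be combined with other sort keys in order().
df <- data.frame(
  group = c(2, 1, 1),
  name = c("b", "B", "a"),
  stringsAsFactors = FALSE
)
df[order(df$group, stri_rank(df$name)), ]
```

Passing ranks instead of the strings themselves lets `order()` break ties between several columns while the string comparison is still done by stringi's collator.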
## 1.5.3 (2020-09-04)
* [DOCUMENTATION] stringi home page has moved to https://stringi.gagolewski.com
and now includes a comprehensive reference manual.
* [NEW FEATURE] #400: `%s$%` and `%stri$%` are now binary operators
that call base R's `sprintf()`.
* [NEW FEATURE] #399: The `%s*%` and `%stri*%` operators can be used
in addition to `stri_dup()`, for the very same purpose.
* [NEW FEATURE] #355: `stri_opts_regex()` now accepts the `time_limit` and
`stack_limit` options so as to prevent malformed or malicious regexes
from running for too long.
* [NEW FEATURE] #345: `stri_startswith()` and `stri_endswith()` are now equipped
with the `negate` parameter.
* [NEW FEATURE] #382: Incorrect regexes are now reported to ease debugging.
* [DEPRECATION WARNING] #347: Any unknown option passed to `stri_opts_fixed()`,
`stri_opts_regex()`, `stri_opts_coll()`, and `stri_opts_brkiter()` now
generates a warning. In the future, the `...` parameter will be removed,
so that will be an error.
* [DEPRECATION WARNING] `stri_duplicated()`'s `fromLast` argument
has been renamed `from_last`. `fromLast` is now its alias scheduled
for removal in a future version of the package.
* [DEPRECATION WARNING] `stri_enc_detect2()`
is scheduled for removal in a future version of the package.
Use `stri_enc_detect()` or the more targeted `stri_enc_isutf8()`,
`stri_enc_isascii()`, etc., instead.
* [DEPRECATION WARNING] `stri_read_lines()`, `stri_write_lines()`,
`stri_read_raw()`: use `con` argument instead of `fname` now.
The argument `fallback_encoding` is scheduled for removal and is no longer
used. `stri_read_lines()` does not support `encoding="auto"` anymore.
* [DEPRECATION WARNING] `nparagraphs` in `stri_rand_lipsum()` has been renamed
`n_paragraphs`.
* [NEW FEATURE] #398: Alternative, British spelling of function parameters
has been introduced, e.g., `stri_opts_coll()` now supports both
`normalization` and `normalisation`.
* [NEW FEATURE] #393: `stri_read_bin()`, `stri_read_lines()`, and
`stri_write_lines()` are no longer marked as draft API.
* [NEW FEATURE] #187: `stri_read_bin()`, `stri_read_lines()`, and
`stri_write_lines()` now support connection objects as well.
* [NEW FEATURE] #386: New function `stri_sort_key()` for generating
locale-dependent sort keys which can be ordered at the byte level and
return an equivalent ordering to the original string (@DavisVaughan).
* [BUGFIX] #138: `stri_encode()` and `stri_rand_strings()`
now can generate strings of much larger lengths.
* [BUGFIX] `stri_wrap()` did not honour `indent` correctly when
`use_width` was `TRUE`.
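
A few of the 1.5.3 additions, sketched in R under the assumption that they behave as described above. The inputs and the `time_limit` value are arbitrary; see the package documentation for the exact meaning of that limit.

```r
library(stringi)

# %s*% duplicates strings, for the same purpose as stri_dup().
"ab" %s*% 3
stri_dup("ab", 3)

# negate = TRUE flips the result of the prefix test.
stri_startswith_fixed(c("alpha", "beta"), "al", negate = TRUE)

# time_limit is a recognised regex option, whereas unknown options
# passed to stri_opts_regex() now generate a warning.
opts <- stri_opts_regex(time_limit = 100L)
stri_detect_regex("aaab", "a+b", opts_regex = opts)
```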