Subject: CVS commit: pkgsrc/textproc/icu
From: Adam Ciarcinski
Date: 2017-11-30 17:03:18
Message id: 20171130160318.AF0E0FB40@cvs.NetBSD.org
Log Message:
icu: updated to 60.1
Changes 60.1:
* Unicode 10.0: 8,518 new characters, including four new scripts, 7,494 new Han characters, and 56 new emoji characters.
- Properties newly supported in ICU: Emoji_Component, Regional_Indicator, Prepended_Concatenation_Mark
* CLDR 32:
- Data for several (mostly Asian) new languages, date formatting patterns using colloquial day period formats ("h:mm B" → "1:30 in the afternoon"), and many other data improvements.
- See the CLDR download page for other CLDR features and migration issues in CLDR 32.
* NumberFormatter, a new number formatting API: A long-overdue refresh of number formatting in ICU with a focus on usability, robustness, and performance. The 30+ settings in DecimalFormat are reduced to 8 in NumberFormatter; all NumberFormatter objects are thread-safe and immutable; and the code is efficient in both the client-side (constant locale) and server-side (variable locale) use cases.
- New users are encouraged to use the new API for number formatting. However, preexisting code can continue using the old API, which has been partially made into a wrapper over the new API.
- Documentation: in Java, see com.ibm.icu.number.NumberFormatter, and in C++, see i18n/unicode/numberformatter.h.
* New options for titlecasing:
- Sentence titlecasing and whole-string titlecasing without custom BreakIterator instances.
- The default index adjustment has been changed from "find first cased character" to "find first letter, number, or symbol"; a new option is available for selecting the previous adjustment behavior.
* Smaller data files for BreakIterator.
- Reverse rules no longer used: Easier updates, easier to conform to Unicode Standard.
- Old source rule files continue to work; reverse rules are ignored.
- Rule-based data files: 1.2MB→0.8MB.
ICU4C Specific Changes
* New API for direct-UTF-8 normalization.
- It also optionally records changes, for source-to-result index mapping and tracking of text metadata.
* More convenient case mapping API (StringPiece→ByteSink).
* ICU now handles ill-formed UTF-8 byte sequences as specified in the W3C Encoding Standard.
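For readers unfamiliar with the Encoding Standard's treatment of ill-formed UTF-8, the sketch below illustrates the general idea: each ill-formed sequence is replaced with U+FFFD, and decoding resumes at the first byte that could not belong to it. This is only an illustration and does not call ICU. It uses Python's built-in UTF-8 decoder, which is assumed here to substitute in the same way for these inputs. The helper name decode_lenient and the sample byte strings are hypothetical.

```python
# Illustration only: Python's built-in decoder, NOT the ICU API.
# Assumption: its "replace" error handler substitutes U+FFFD the same way
# the Encoding Standard's UTF-8 decoder does for the samples below.

REPLACEMENT = "\ufffd"


def decode_lenient(data: bytes) -> str:
    """Decode UTF-8, replacing each ill-formed sequence with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# Hypothetical sample inputs, chosen to cover common kinds of ill-formed UTF-8.
samples = {
    "truncated 3-byte sequence followed by 'A'": b"\xe2\x82A",
    "overlong encoding of U+0000": b"\xc0\x80",
    "UTF-8-encoded surrogate U+D800": b"\xed\xa0\x80",
    "well-formed euro sign": b"\xe2\x82\xac",
}

for label, raw in samples.items():
    text = decode_lenient(raw)
    count = text.count(REPLACEMENT)
    print(f"{label}: {raw!r} -> {text!r} ({count} replacement(s))")
```

The truncated sequence collapses into a single U+FFFD and the following 'A' survives, while the overlong and surrogate encodings are rejected byte by byte. Whether ICU 60.1 gives identical results for every input is not verified here.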
Files: