./devel/pcre2, Perl Compatible Regular Expressions library (major version 2)

[ CVSweb ] [ Homepage ] [ RSS ] [ Required by ] [ Add to tracker ]

Branch: CURRENT, Version: 10.32, Package name: pcre2-10.32, Maintainer: pkgsrc-users

PCRE2 is a re-working of the original PCRE library to provide an entirely new

PCRE2 is written in C, and it has its own API. There are three sets of
functions, one for the 8-bit library, which processes strings of bytes, one for
the 16-bit library, which processes strings of 16-bit values, and one for the
32-bit library, which processes strings of 32-bit values. There are no C++

Required to build:

Package options: pcre2-jit

Master sites: (Expand)

SHA1: 31dea762ff549cda09b7df33648f9d4cc3707cf8
RMD160: 708cdaa837ff5cdb5c97db3991d51778a2e9102e
Filesize: 1603.334 KB

Version history: (Expand)

CVS history: (Expand)

   2018-12-02 08:45:03 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
pcre2: update to 10.32.

Version 10.32 10-September-2018

This is another mainly bugfix and tidying release with a few minor
enhancements. These are the main ones:

1. pcre2grep now supports the inclusion of binary zeros in patterns that are
read from files via the -f option.

2. ./configure now supports --enable-jit=auto, which automatically enables JIT
if the hardware supports it.

3. In pcre2_dfa_match(), internal recursive calls no longer use the stack for
local workspace and local ovectors. Instead, an initial block of stack is
reserved, but if this is insufficient, heap memory is used. The heap limit
parameter now applies to pcre2_dfa_match().

4. Updated to Unicode version 11.0.0.

5. (*ACCEPT:ARG), (*FAIL:ARG), and (*COMMIT:ARG) are now supported.

6. Added support for \N{U+dddd}, but only in Unicode mode.

7. Added support for (?^) to unset all imnsx options.
   2018-04-17 10:56:06 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
pcre2: update to 10.31.

Version 10.31 12-February-2018

This is mainly a bugfix and tidying release (see ChangeLog for full details).
However, there are some minor enhancements.

1. New pcre2_config() options: PCRE2_CONFIG_NEVER_BACKSLASH_C and

2. New pcre2_pattern_info() option PCRE2_INFO_EXTRAOPTIONS to retrieve the
extra compile time options.

3. There are now public names for all the pcre2_compile() error numbers.

field callout_flags in callout blocks.
   2017-09-07 16:50:44 by Thomas Klausner | Files touched by this commit (1)
Log message:
Sprinkle some wildcards for arm, I hope it works.
   2017-09-07 15:47:21 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
Make JIT an option. Enable it by default on the list of platforms
martin@ extracted from the source, see PR 52524.

The pattern matching and/or the list itself may need further fixes.

Retest on NetBSD 8 shows that the binaries are now mprotect-safe,
remove variable.

   2017-08-17 21:53:54 by Niclas Rosenvik | Files touched by this commit (4) | Package updated
Log message:
Update pcre2 to version 10.30.

Fixes CVE-2017-8399.
Fixes CVE-2017-7186.
Fixes CVE-2017-8786.

Change Log for PCRE2

Version 10.30 14-August-2017

1. The main interpreter, pcre2_match(), has been refactored into a new version
that does not use recursive function calls (and therefore the stack) for
remembering backtracking positions. This makes --disable-stack-for-recursion a
NOOP. The new implementation allows backtracking into recursive group calls in
patterns, making it more compatible with Perl, and also fixes some other
hard-to-do issues such as #1887 in Bugzilla. The code is also cleaner because
the old code had a number of fudges to try to reduce stack usage. It seems to
run no slower than the old code.

A number of bugs in the refactored code were subsequently fixed during testing
before release, but after the code was made available in the repository. These
bugs were never in fully released code, but are noted here for the record.

  (a) If a pattern had fewer capturing parentheses than the ovector supplied in
      the match data block, a memory error (detectable by ASAN) occurred after
      a match, because the external block was being set from non-existent
      internal ovector fields. Fixes oss-fuzz issue 781.

  (b) A pattern with very many capturing parentheses (when the internal frame
      size was greater than the initial frame vector on the stack) caused a
      crash. A vector on the heap is now set up at the start of matching if the
      vector on the stack is not big enough to handle at least 10 frames.
      Fixes oss-fuzz issue 783.

  (c) Handling of (*VERB)s in recursions was wrong in some cases.

  (d) Captures in negative assertions that were used as conditions were not
      happening if the assertion matched via (*ACCEPT).

  (e) Mark values were not being passed out of recursions.

  (f) Refactor some code in do_callout() to avoid picky compiler warnings about
      negative indices. Fixes oss-fuzz issue 1454.

  (g) Similarly refactor the way the variable length ovector is addressed for
      similar reasons. Fixes oss-fuzz issue 1465.

2. Now that pcre2_match() no longer uses recursive function calls (see above),
the "match limit recursion" value seems misnamed. It still exists, and \ 
the depth of tree that is searched. To avoid future confusion, it has been
renamed as "depth limit" in all relevant places (--with-depth-limit,
(*LIMIT_DEPTH), pcre2_set_depth_limit(), etc) but the old names are still
available for backwards compatibility.

3. Hardened pcre2test so as to reduce the number of bugs reported by fuzzers:

  (a) Check for malloc failures when getting memory for the ovector (POSIX) or
      the match data block (non-POSIX).

4. In the 32-bit library in non-UTF mode, an attempt to find a Unicode property
for a character with a code point greater than 0x10ffff (the Unicode maximum)
caused a crash.

5. If a lookbehind assertion that contained a back reference to a group
appearing later in the pattern was compiled with the PCRE2_ANCHORED option,
undefined actions (often a segmentation fault) could occur, depending on what
other options were set. An example assertion is (?<!\1(abc)) where the
reference \1 precedes the group (abc). This fixes oss-fuzz issue 865.

6. Added the PCRE2_INFO_FRAMESIZE item to pcre2_pattern_info() and arranged for
pcre2test to use it to output the frame size when the "framesize" \ 
modifier is

7. Reworked the recursive pattern matching in the JIT compiler to follow the
interpreter changes.

8. When the zero_terminate modifier was specified on a pcre2test subject line
for global matching, unpredictable things could happen. For example, in UTF-8
mode, the pattern //g,zero_terminate read random memory when matched against an
empty string with zero_terminate. This was a bug in pcre2test, not the library.

9. Moved some Windows-specific code in pcre2grep (introduced in 10.23/13) out
of the section that is compiled when Unix-style directory scanning is
available, and into a new section that is always compiled for Windows.

10. In pcre2test, explicitly close the file after an error during serialization
or deserialization (the "load" or "save" commands).

11. Fix memory leak in pcre2_serialize_decode() when the input is invalid.

12. Fix potential NULL dereference in pcre2_callout_enumerate() if called with
a NULL pattern pointer when Unicode support is available.

13. When the 32-bit library was being tested by pcre2test, error messages that
were longer than 64 code units could cause a buffer overflow. This was a bug in

14. The alternative matching function, pcre2_dfa_match() misbehaved if it
encountered a character class with a possessive repeat, for example [a-f]{3}+.

15. The depth (formerly recursion) limit now applies to DFA matching (as
of 10.23/36); pcre2test has been upgraded so that \=find_limits works with DFA
matching to find the minimum value for this limit.

16. Since 10.21, if pcre2_match() was called with a null context, default
memory allocation functions were used instead of whatever was used when the
pattern was compiled.

17. Changes to the pcre2test "memory" modifier on a subject line. \ 
These apply
only to pcre2_match():

  (a) Warn if null_context is set on both pattern and subject, because the
      memory details cannot then be shown.

  (b) Remember (up to a certain number of) memory allocations and their
      lengths, and list only the lengths, so as to be system-independent.
      (In practice, the new interpreter never has more than 2 blocks allocated

18. Make pcre2test detect an error return from pcre2_get_error_message(), give
a message, and abandon the run (this would have detected #13 above).

19. Implemented PCRE2_ENDANCHORED.

20. Applied Jason Hood's patches (slightly modified) to pcre2grep, to implement
the --output=text (-O) option and the inbuilt callout echo.

21. Extend auto-anchoring etc. to ignore groups with a zero qualifier and
single-branch conditions with a false condition (e.g. DEFINE) at the start of a
branch. For example, /(?(DEFINE)...)^A/ and /(...){0}^B/ are now flagged as

22. Added an explicit limit on the amount of heap used by pcre2_match(), set by
pcre2_set_heap_limit() or (*LIMIT_HEAP=xxx). Upgraded pcre2test to show the
heap limit along with other pattern information, and to find the minimum when
the find_limits modifier is set.

23. Write to the last 8 bytes of the pcre2_real_code structure when a compiled
pattern is set up so as to initialize any padding the compiler might have
included. This avoids valgrind warnings when a compiled pattern is copied, in
particular when it is serialized.

24. Remove a redundant line of code left in accidentally a long time ago.

25. Remove a duplication typo in pcre2_tables.c

26. Correct an incorrect cast in pcre2_valid_utf.c

27. Update pcre2test, remove some unused code in pcre2_match(), and upgrade the
tests to improve coverage.

28. Some fixes/tidies as a result of looking at Coverity Scan output:

    (a) Typo: ">" should be ">=" in opcode check in \ 
    (b) Added some casts to avoid "suspicious implicit sign extension".
    (c) Resource leaks in pcre2test in rare error cases.
    (d) Avoid warning for never-use case OP_TABLE_LENGTH which is just a fudge
        for checking at compile time that tables are the right size.
    (e) Add missing "fall through" comment.

29. Implemented PCRE2_EXTENDED_MORE and related /xx and (?xx) features.

30. Implement (?n: for PCRE2_NO_AUTO_CAPTURE, because Perl now has this.

31. If more than one of "push", "pushcopy", or \ 
"pushtablescopy" were set in
pcre2test, a crash could occur.

32. Make -bigstack in RunTest allocate a 64Mb stack (instead of 16 MB) so that
all the tests can run with clang's sanitizing options.

33. Implement extra compile options in the compile context and add the first

34. Implement newline type PCRE2_NEWLINE_NUL.

35. A lookbehind assertion that had a zero-length branch caused undefined
behaviour when processed by pcre2_dfa_match(). This is oss-fuzz issue 1859.

36. The match limit value now also applies to pcre2_dfa_match() as there are
patterns that can use up a lot of resources without necessarily recursing very
deeply. (Compare item 10.23/36.) This should fix oss-fuzz #1761.


38. Fix returned offsets from regexec() when REG_STARTEND is used with a
starting offset greater than zero.

39. Implement REG_PEND (GNU extension) for the POSIX wrapper.

40. Implement the subject_literal modifier in pcre2test, and allow jitstack on
pattern lines.

41. Implement PCRE2_LITERAL and use it to support REG_NOSPEC.

42. Implement PCRE2_EXTRA_MATCH_LINE and PCRE2_EXTRA_MATCH_WORD for the benefit
of pcre2grep.

43. Re-implement pcre2grep's -F, -w, and -x options using PCRE2_LITERAL,

    (a) The -F option did not work for fixed strings containing \E.
    (b) The -w option did not work for patterns with multiple branches.

44. Added configuration options for the SELinux compatible execmem allocator in

45. Increased the limit for searching for a "must be present" code unit in
subjects from 1000 to 2000 for 8-bit searches, since they use memchr() and are
much faster.

46. Arrange for anchored patterns to record and use "first code unit" data,
because this can give a fast "no match" without searching for a \ 
"required code
unit". Previously only non-anchored patterns did this.

47. Upgraded the Unicode tables from Unicode 8.0.0 to Unicode 10.0.0.

48. Add the callout_no_where modifier to pcre2test.

49. Update extended grapheme breaking rules to the latest set that are in
Unicode Standard Annex #29.

50. Added experimental foreign pattern conversion facilities
(pcre2_pattern_convert() and friends).

51. Change the macro FWRITE, used in pcre2grep, to FWRITE_IGNORE because FWRITE
is defined in a system header in cygwin. Also modified some of the #ifdefs in
pcre2grep related to Windows and Cygwin support.

52. Change 3(g) for 10.23 was a bit too zealous. If a hyphen that follows a
character class is the last character in the class, Perl does not give a
warning. PCRE2 now also treats this as a literal.

53. Related to 52, though PCRE2 was throwing an error for [[:digit:]-X] it was
not doing so for [\d-X] (and similar escapes), as is documented.

54. Fixed a MIPS issue in the JIT compiler reported by Joshua Kinard.

55. Fixed a "maybe uninitialized" warning for class_uchardata in \p \ 
handling in
pcre2_compile() which could never actually trigger (code should have been cut
out when Unicode support is disabled).
   2017-02-20 10:44:34 by Thomas Klausner | Files touched by this commit (3) | Package updated
Log message:
Updated pcre2 to 10.23.

Version 10.23 14-February-2017

1. ChangeLog has the details of a lot of bug fixes and tidies.

2. There has been a major re-factoring of the pcre2_compile.c file. Most syntax
checking is now done in the pre-pass that identifies capturing groups. This has
reduced the amount of duplication and made the code tidier. While doing this,
some minor bugs and Perl incompatibilities were fixed (see ChangeLog for

3. Back references are now permitted in lookbehind assertions when there are
no duplicated group numbers (that is, (?| has not been used), and, if the
reference is by name, there is only one group of that name. The referenced
group must, of course be of fixed length.

4. \g{+<number>} (e.g. \g{+2} ) is now supported. It is a "forward back
reference" and can be useful in repetitions (compare \g{-<number>} ). \ 
Perl does
not recognize this syntax.

5. pcre2grep now automatically expands its buffer up to a maximum set by

6. The -t option (grand total) has been added to pcre2grep.

7. A new function called pcre2_code_copy_with_tables() exists to copy a
compiled pattern along with a private copy of the character tables that is

8. A user supplied a number of patches to upgrade pcre2grep under Windows and
tidy the code.

9. Several updates have been made to pcre2test and test scripts (see
   2017-01-19 19:52:30 by Alistair G. Crooks | Files touched by this commit (352)
Log message:
Convert all occurrences (353 by my count) of

	MASTER_SITES= 	site1 \

style continuation lines to be simple repeated


lines. As previewed on tech-pkg. With thanks to rillig for fixing pkglint
   2016-10-01 12:21:03 by Thomas Klausner | Files touched by this commit (1) | Package updated
Log message: