Subject: CVS commit: wip/re2
From: Aleksej Saushev
Date: 2011-08-27 20:45:02
Message id: E1QxNse-0002vS-IV@sfs-ml-3.v29.ch3.sourceforge.com
Log Message:
Import RE2 as wip/re2.
RE2 is a fast, safe, thread-friendly alternative to backtracking
regular expression engines like those used in PCRE, Perl, and
Python. It is a C++ library.
Backtracking engines are typically full of features and
convenient syntactic sugar but can be forced into taking
exponential amounts of time on even small inputs. RE2 uses
automata theory to guarantee that regular expression searches
run in time linear in the size of the input. RE2 implements
memory limits, so that searches can be constrained to a fixed
amount of memory. RE2 is engineered to use a small fixed C++
stack footprint no matter what inputs or regular expressions it
must process; thus RE2 is useful in multithreaded environments
where thread stacks cannot grow arbitrarily large.
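To illustrate the linear-time claim above, here is a minimal sketch of the general automata idea: track every pattern position that could be active at once, instead of backtracking over choices. It is not RE2's implementation and does not use its API. The mini-syntax (literal characters, each optionally followed by '?') and the names Item, parse_pattern, add_state and full_match are assumptions made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One pattern element: a literal character, optionally followed by '?'.
// (Hypothetical mini-syntax for illustration only; not RE2 syntax handling.)
struct Item {
    char ch;
    bool optional;
};

// Split a pattern such as "a?a?aa" into items. A '?' that does not follow
// a literal is treated as a literal itself.
std::vector<Item> parse_pattern(const std::string& pattern) {
    std::vector<Item> items;
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        bool opt = (i + 1 < pattern.size() && pattern[i + 1] == '?');
        items.push_back({pattern[i], opt});
        if (opt) ++i;  // skip the '?'
    }
    return items;
}

// Mark position i as active, plus every position reachable from it by
// skipping optional items.
static void add_state(const std::vector<Item>& items,
                      std::vector<bool>& active, std::size_t i) {
    while (!active[i]) {
        active[i] = true;
        if (i < items.size() && items[i].optional) {
            ++i;
        } else {
            return;
        }
    }
}

// Whole-string match. Each input character is examined once per pattern
// position, so the work is bounded by text length times pattern length.
bool full_match(const std::string& pattern, const std::string& text) {
    const std::vector<Item> items = parse_pattern(pattern);
    std::vector<bool> active(items.size() + 1, false);
    add_state(items, active, 0);
    for (char c : text) {
        std::vector<bool> next(items.size() + 1, false);
        for (std::size_t i = 0; i < items.size(); ++i) {
            if (active[i] && items[i].ch == c) add_state(items, next, i + 1);
        }
        active.swap(next);
    }
    return active[items.size()];
}
```

With this sketch, full_match("a?a?a?aaa", "aaa") returns true after three passes over a seven-entry state vector. A backtracking matcher may have to try many combinations of the optional items on the same kind of pattern as it grows.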
On large inputs, RE2 is often much faster than backtracking
engines; its use of automata theory lets it apply optimizations
that the others cannot.
Unlike most automata-based engines, RE2 implements almost all
the common Perl and PCRE features and syntactic sugars. It also
finds the leftmost-first match, the same match that Perl would,
and can return submatch information. The one significant
exception is that RE2 drops support for backreferences and
generalized zero-width assertions, because they cannot be
implemented efficiently. The syntax page gives full details.
For those who want a simpler syntax, RE2 has a POSIX mode that
accepts only the POSIX egrep operators and implements
leftmost-longest overall matching.
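The difference between leftmost-first and leftmost-longest matching can be shown with a small sketch restricted to an alternation of literal strings. It does not call RE2, and the names Match, leftmost_first and leftmost_longest are hypothetical helpers written for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Result of a search: whether anything matched, where, and the matched text.
struct Match {
    bool found;
    std::size_t pos;
    std::string text;
};

// Leftmost-first (Perl-style): at the earliest position where any
// alternative matches, take the first alternative in pattern order.
Match leftmost_first(const std::vector<std::string>& alts,
                     const std::string& text) {
    for (std::size_t pos = 0; pos <= text.size(); ++pos) {
        for (const std::string& alt : alts) {
            if (text.compare(pos, alt.size(), alt) == 0) {
                return {true, pos, alt};
            }
        }
    }
    return {false, 0, ""};
}

// Leftmost-longest (POSIX-style): at the earliest position where any
// alternative matches, take the longest alternative that matches there.
Match leftmost_longest(const std::vector<std::string>& alts,
                       const std::string& text) {
    for (std::size_t pos = 0; pos <= text.size(); ++pos) {
        Match best{false, pos, ""};
        for (const std::string& alt : alts) {
            if (text.compare(pos, alt.size(), alt) == 0 &&
                (!best.found || alt.size() > best.text.size())) {
                best = {true, pos, alt};
            }
        }
        if (best.found) return best;
    }
    return {false, 0, ""};
}
```

For the alternatives "a" and "ab" searched in "xab", both functions report a match at position 1, but leftmost_first returns "a" while leftmost_longest returns "ab".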
Files: