Log message:
update to pyparsing-2.1.8
==========
Change Log
==========
Version 2.1.8 -
------------------------------
- Fixed issue in the optimization to _trim_arity, when the full
stacktrace is retrieved to determine if a TypeError is raised in
pyparsing or in the caller's parse action. Code was traversing
the full stacktrace, and potentially encountering UnicodeDecodeError.
- Fixed bug in ParserElement.inlineLiteralsUsing, causing infinite
loop with Suppress.
- Fixed bug in Each, when merging named results from multiple
expressions in a ZeroOrMore or OneOrMore. Also fixed bug when
ZeroOrMore expressions were erroneously treated as required
expressions in an Each expression.
- Added a few more inline doc examples.
- Improved use of runTests in several example scripts.
Version 2.1.7 -
------------------------------
- Fixed regression reported by Andrea Censi (surfaced in PyContracts
tests) when using ParseSyntaxExceptions (raised when using operator '-')
with packrat parsing.
- Minor fix to oneOf, to accept all iterables, not just space-delimited
strings and lists. (If you have a list or set of strings, it is
not necessary to concat them using ' '.join to pass them to oneOf,
oneOf will accept the list or set or generator directly.)
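The oneOf change above can be sketched as follows. This is a minimal illustration, assuming pyparsing 2.1.7 or later is installed; the names `colors` and `color` are made up for the example.

```python
from pyparsing import oneOf

# Hypothetical set of alternatives. Before 2.1.7 this would have been
# passed as ' '.join(colors) or converted to a list first.
colors = {"red", "green", "blue"}

# oneOf accepts the set (or another iterable of strings) directly.
color = oneOf(colors)

matched = color.parseString("green")[0]
print(matched)
```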
Version 2.1.6 -
------------------------------
- *Major packrat upgrade*, inspired by patch provided by Tal Einat -
many, many, thanks to Tal for working on this! Tal's tests show
faster parsing performance (2X in some tests), *and* memory reduction
from 3GB down to ~100MB! Requires no changes to existing code using
packratting. (Uses OrderedDict, available in Python 2.7 and later.
For Python 2.6 users, will attempt to import from ordereddict
backport. If not present, will implement pure-Python Fifo dict.)
- Minor API change - to better distinguish between the flexible
numeric types defined in pyparsing_common, I've changed "numeric"
(which parsed numbers of different types and returned int for ints,
float for floats, etc.) and "number" (which parsed numbers of int
or float type, and returned all floats) to "number" and "fnumber"
respectively. I hope the "f" prefix of "fnumber" will be a better
indicator of its internal conversion of parsed values to floats,
while the generic "number" is similar to the flexible number syntax
in other languages. Also fixed a bug in pyparsing_common.numeric
(now renamed to pyparsing_common.number), in which integers were parsed
and returned as floats instead of being retained as ints.
- Fixed bug in upcaseTokens and downcaseTokens introduced in 2.1.5,
when the parse action was used in conjunction with results names.
Reported by Steven Arcangeli from the dql project, thanks for your
patience, Steven!
- Major change to docs! After seeing some comments on reddit about
general issue with docs of Python modules, and thinking that I'm a
little overdue in doing some doc tuneup on pyparsing, I decided to
follow the suggestions of the redditor and add more inline examples
to the pyparsing reference documentation. I hope this addition
will clarify some of the more common questions people have, especially
when first starting with pyparsing/Python.
- Deprecated ParseResults.asXML. I've never been too happy with this
method, and it usually forces some unnatural code in the parsers in
order to get decent tag names. The amount of guesswork that asXML
has to do to try to match names with values should have been a red
flag from day one. If you are using asXML, you will need to implement
your own ParseResults->XML serialization. Or consider migrating to
a more current format such as JSON (which is very easy to do:
results_as_json = json.dumps(parse_result.asDict())). Hopefully, when
I remove this code in a future version, I'll also be able to simplify
some of the craziness in ParseResults, which IIRC was only there to try
to make asXML work.
- Updated traceParseAction parse action decorator to show the repr
of the input and output tokens, instead of the str format, since
str has been simplified to just show the token list content.
(The change to ParseResults.__str__ occurred in pyparsing 2.0.4, but
it seems that didn't make it into the release notes - sorry! Too
many users, especially beginners, were confused by the
"([token_list], {names_dict})" str format for ParseResults, thinking
they were getting a tuple containing a list and a dict. The full form
can be seen if using repr().)
For tracing tokens in and out of parse actions, the more complete
repr form provides important information when debugging parse actions.
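For the asXML deprecation noted in this release, the suggested JSON route might look like the sketch below. It assumes pyparsing is installed; the grammar and the results names `name` and `age` are hypothetical and only serve to give asDict() something to return.

```python
import json

from pyparsing import Word, alphas, nums

# Hypothetical two-field grammar with results names attached.
entry = Word(alphas)("name") + Word(nums)("age")

parse_result = entry.parseString("alice 30")

# asDict() returns the named results as a plain dict, which the json
# module can serialize directly.
results_as_json = json.dumps(parse_result.asDict(), sort_keys=True)
print(results_as_json)
```

Only tokens that carry a results name appear in the output, so a grammar that relied on asXML guessing tag names would need explicit names first.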
Version 2.1.5 - June, 2016
------------------------------
- Added ParserElement.split() generator method, similar to re.split().
Includes optional arguments maxsplit (to limit the number of splits),
and includeSeparators (to include the separating matched text in the
returned output, default=False).
- Added a new parse action construction helper tokenMap, which will
apply a function and optional arguments to each element in a
ParseResults. So this parse action:
def lowercase_all(tokens):
    return [str(t).lower() for t in tokens]
OneOrMore(Word(alphas)).setParseAction(lowercase_all)
can now be written:
OneOrMore(Word(alphas)).setParseAction(tokenMap(str.lower))
Also simplifies writing conversion parse actions like:
integer = Word(nums).setParseAction(lambda t: int(t[0]))
to just:
integer = Word(nums).setParseAction(tokenMap(int))
If additional arguments are necessary, they can be included in the
call to tokenMap, as in:
hex_integer = Word(hexnums).setParseAction(tokenMap(int, 16))
- Added more expressions to pyparsing_common:
. IPv4 and IPv6 addresses (including long, short, and mixed forms
of IPv6)
. MAC address
. ISO8601 date and date time strings (with named fields for year, month, etc.)
. UUID (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)
. hex integer (returned as int)
. fraction (integer '/' integer, returned as float)
. mixed integer (integer '-' fraction, or just fraction, returned as float)
. stripHTMLTags (parse action to remove tags from HTML source)
. parse action helpers convertToDate and convertToDatetime to do custom parse
time conversions of parsed ISO8601 strings
- runTests now returns a two-tuple: success if all tests succeed,
and an output list of each test and its output lines.
- Added failureTests argument (default=False) to runTests, so that
tests can be run that are expected failures, and runTests' success
value will return True only if all tests *fail* as expected. Also,
parseAll now defaults to True.
- New example numerics.py, shows samples of parsing integer and real
numbers using locale-dependent formats:
4.294.967.295,000
4 294 967 295,000
4,294,967,295.000
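The ParserElement.split() method added in this release can be tried with a sketch like the one below, assuming pyparsing 2.1.5 or later. The `comma` expression and the sample string are illustrative only.

```python
from pyparsing import Literal

# Separator expression; any pyparsing expression could be used here.
comma = Literal(",")

# split() is a generator, so wrap it in list() to see all pieces.
parts = list(comma.split("a,b,c"))

# maxsplit limits the number of splits, as with re.split().
limited = list(comma.split("a,b,c", maxsplit=1))

# includeSeparators=True also yields the matched separator text.
with_separators = list(comma.split("a,b,c", includeSeparators=True))

print(parts, limited, with_separators)
```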
Version 2.1.4 - May, 2016
------------------------------
- Split out the '==' behavior in ParserElement, now implemented
as the ParserElement.matches() method. Using '==' for string test
purposes will be removed in a future release.
- Expanded capabilities of runTests(). Will now accept embedded
comments (default is Python style, leading '#' character, but
customizable). Comments will be emitted along with the tests and
test output. Useful during test development, to create a test string
consisting only of test case description comments separated by
blank lines, and then fill in the test cases. Will also highlight
ParseFatalExceptions with "(FATAL)".
- Added a 'pyparsing_common' class containing common/helpful little
expressions such as integer, float, identifier, etc. I used this
class as a sort of embedded namespace, to contain these helpers
without further adding to pyparsing's namespace bloat.
- Minor enhancement to traceParseAction decorator, to retain the
parse action's name for the trace output.
- Added optional 'fatal' keyword arg to addCondition, to indicate that
a condition failure should halt parsing immediately.
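As a rough illustration of the ParserElement.matches() method introduced in this release (assuming pyparsing 2.1.4 or later; the `integer` expression is just an example), a string test that previously relied on '==' could be written as:

```python
from pyparsing import Word, nums

integer = Word(nums)

# matches() returns a bool instead of raising ParseException.
whole = integer.matches("100")

# By default the entire string must match, so trailing text fails.
trailing = integer.matches("100abc")

# With parseAll=False only a leading match is required.
prefix_only = integer.matches("100abc", parseAll=False)

print(whole, trailing, prefix_only)
```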
Version 2.1.3 - May, 2016
------------------------------
- _trim_arity fix in 2.1.2 was very version-dependent on Py 3.5.0.
Now works for Python 2.x, 3.3, 3.4, 3.5.0, and 3.5.1 (and hopefully
beyond).
Version 2.1.2 - May, 2016
------------------------------
- Fixed bug in _trim_arity when pyparsing code is included in a
PyInstaller bundle, reported by maluwa.
- Fixed catastrophic regex backtracking in implementation of the
quoted string expressions (dblQuotedString, sglQuotedString, and
quotedString). Reported on the pyparsing wiki by webpentest,
good catch! (Also tuned up some other expressions susceptible to the
same backtracking problem, such as cStyleComment, cppStyleComment,
etc.)
Version 2.1.1 - March, 2016
---------------------------
- Added support for assigning to ParseResults using slices.
- Fixed bug in ParseResults.toDict(), in which dict values were always
converted to dicts, even if they were just unkeyed lists of tokens.
Reported on SO by Gerald Thibault, thanks Gerald!
- Fixed bug in SkipTo when using failOn, reported by robyschek, thanks!
- Fixed bug in Each introduced in 2.1.0, reported by AND patch and
unit test submitted by robyschek, well done!
- Removed use of functools.partial in replaceWith, as this creates
an ambiguous signature for the generated parse action, which fails in
PyPy. Reported by Evan Hubinger, thanks Evan!
- Added default behavior to QuotedString to convert embedded '\t', '\n',
etc. characters to their whitespace counterparts. Found during Q&A
exchange on SO with Maxim.
|
Log message:
Update devel/py-pyparsing to 2.1.0.
Changes:
Version 2.1.0 - February, 2016
------------------------------
- Modified the internal _trim_arity method to distinguish between
TypeError's raised while trying to determine parse action arity and
those raised within the parse action itself. This will clear up those
confusing "<lambda>() takes exactly 1 argument (0 given)" error
messages when there is an actual TypeError in the body of the parse
action. Thanks to all who have raised this issue in the past, and
most recently to Michael Cohen, who sent in a proposed patch, and got
me to finally tackle this problem.
- Added compatibility for pickle protocols 2-4 when pickling ParseResults.
In Python 2.x, protocol 0 was the default, and protocol 2 did not work.
In Python 3.x, protocol 3 is the default, so explicitly naming
protocol 0 or 1 was required to pickle ParseResults. With this release,
all protocols 0-4 are supported. Thanks for reporting this on StackOverflow,
Arne Wolframm, and for providing a nice simple test case!
- Added optional 'stopOn' argument to ZeroOrMore and OneOrMore, to
simplify breaking on stop tokens that would match the repetition
expression.
It is a common problem to fail to look ahead when matching repetitive
tokens if the sentinel at the end also matches the repetition
expression, as when parsing "BEGIN aaa bbb ccc END" with:
"BEGIN" + OneOrMore(Word(alphas)) + "END"
Since "END" matches the repetition expression "Word(alphas)", it will
never get parsed as the terminating sentinel. Up until now, this had
to be resolved by the user inserting their own negative lookahead:
"BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"
Using stopOn, they can more easily write:
"BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
The stopOn argument can be a literal string or a pyparsing expression.
Inspired by a question by Lamakaha on StackOverflow (and many previous
questions with the same negative-lookahead resolution).
- Added expression names for many internal and builtin expressions, to
reduce name and error message overhead during parsing.
- Converted helper lambdas to functions to refactor and add docstring
support.
- Fixed ParseResults.asDict() to correctly convert nested ParseResults
values to dicts.
- Cleaned up some examples, fixed typo in fourFn.py identified by
aristotle2600 on reddit.
- Removed keepOriginalText helper method, which was deprecated ages ago.
Superseded by originalTextFor.
- Same for the Upcase class, which was long ago deprecated and replaced
with the upcaseTokens method.
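The stopOn behavior described for 2.1.0 can be checked with a small runnable sketch, assuming pyparsing 2.1.0 or later. The variable names `greedy` and `bounded` are invented for this example.

```python
from pyparsing import Literal, OneOrMore, ParseException, Word, alphas

text = "BEGIN aaa bbb ccc END"

# Without a lookahead, the repetition also consumes "END", so the
# trailing Literal("END") has nothing left to match.
greedy = Literal("BEGIN") + OneOrMore(Word(alphas)) + Literal("END")
try:
    greedy.parseString(text)
    greedy_failed = False
except ParseException:
    greedy_failed = True

# stopOn ends the repetition as soon as the sentinel is seen.
bounded = (Literal("BEGIN")
           + OneOrMore(Word(alphas), stopOn="END")
           + Literal("END"))
tokens = bounded.parseString(text).asList()

print(greedy_failed, tokens)
```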
Version 2.0.7 - December, 2015
------------------------------
- Simplified string representation of Forward class, to avoid memory
and performance errors while building ParseException messages. Thanks,
Will McGugan, Andrea Censi, and Martijn Vermaat for the bug reports and
test code.
- Cleaned up additional issues from enhancing the error messages for
Or and MatchFirst, handling Unicode values in expressions. Fixes Unicode
encoding issues in Python 2, thanks to Evan Hubinger for the bug report.
- Fixed implementation of dir() for ParseResults - was leaving out all the
defined methods and just adding the custom results names.
- Fixed bug in ignore() that was introduced in pyparsing 1.5.3, that would
not accept a string literal as the ignore expression.
- Added new example parseTabularData.py to illustrate parsing of data
formatted in columns, with detection of empty cells.
- Updated a number of examples to more current Python and pyparsing
forms.
|
Log message:
Update devel/py-pyparsing to 2.0.6.
pkgsrc changes:
- convert to egg.mk (no functional changes intended)
- use MASTER_SITE_PYPI
- update HOMEPAGE (sf.net project page seems not available ATM while the
distfiles are fetchable from sf.net)
- minor cosmetic fixes
Changes:
Version 2.0.6 -
---------------------------
- Fixed a bug in Each when multiple Optional elements are present.
Thanks for reporting this, whereswalden on SO.
- Fixed another bug in Each, when Optional elements have results names
or parse actions, reported by Max Rothman - thank you, Max!
- Added optional parseAll argument to runTests, whether tests should
require the entire input string to be parsed or not (similar to
parseAll argument to parseString). Plus a little neaten-up of the
output on Python 2 (no stray ()'s).
- Modified exception messages from MatchFirst and Or expressions. These
were formerly misleading as they would only give the first or longest
exception mismatch error message. Now the error message includes all
the alternatives that were possible matches. Originally proposed by
a pyparsing user, but I've lost the email thread - finally figured out
a fairly clean way to do this.
- Fixed a bug in Or, when a parse action on an alternative raises an
exception, other potentially matching alternatives were not always tried.
Reported by TheVeryOmni on the pyparsing wiki, thanks!
- Fixed a bug in dump() introduced in 2.0.4, where list values were shown
in duplicate.
Version 2.0.5 -
---------------------------
- (&$(@#&$(@!!!! Some "print" statements snuck into pyparsing v2.0.4,
breaking Python 3 compatibility! Fixed. Reported by jenshn, thanks!
Version 2.0.4 -
---------------------------
- Added ParserElement.addCondition, to simplify adding parse actions
that act primarily as filters. If the given condition evaluates False,
pyparsing will raise a ParseException. The condition should be a method
with the same method signature as a parse action, but should return a
boolean. Suggested by Victor Porton, nice idea Victor, thanks!
- Slight mod to srange to accept unicode literals for the input string,
such as "[а-яА-Я]" instead of "[\u0430-\u044f\u0410-\u042f]". Thanks
to Alexandr Suchkov for the patch!
- Enhanced implementation of replaceWith.
- Fixed enhanced ParseResults.dump() method when the results consist
only of an unnamed array of sub-structure results. Reported by Robin
Siebler, thanks for your patience and persistence, Robin!
- Fixed bug in fourFn.py example code, where pi and e were defined using
CaselessLiteral instead of CaselessKeyword. This was not a problem until
adding a new function 'exp', and the leading 'e' of 'exp' was accidentally
parsed as the mathematical constant 'e'. Nice catch, Tom Grydeland - thanks!
- Adopt new-fangled Python features, like decorators and ternary expressions,
per suggestions from Williamzjc - thanks William! (Oh yeah, I'm not
supporting Python 2.3 with this code any more...) Plus, some additional
code fixes/cleanup - thanks again!
- Added ParserElement.runTests, a little test bench for quickly running
an expression against a list of sample input strings. Basically, I got
tired of writing the same test code over and over, and finally added it
as a test point method on ParserElement.
- Added withClass helper method, a simplified version of withAttribute for
the common but annoying case when defining a filter on a div's class -
made difficult because 'class' is a Python reserved word.
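A filter-style condition of the kind addCondition was added for might look like the sketch below. It assumes pyparsing 2.0.4 or later; the `small_integer` name and the limit of 100 are arbitrary choices for illustration.

```python
from pyparsing import ParseException, Word, nums

# Convert the matched digits to an int first.
integer = Word(nums).setParseAction(lambda t: int(t[0]))

# The condition takes the same arguments as a parse action but returns
# a boolean; a False result makes pyparsing raise ParseException.
small_integer = integer.copy().addCondition(lambda t: t[0] < 100)

accepted = small_integer.parseString("42")[0]

try:
    small_integer.parseString("420")
    rejected = False
except ParseException:
    rejected = True

print(accepted, rejected)
```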
Version 2.0.3 -
---------------------------
- Fixed escaping behavior in QuotedString. Formerly, only quotation
marks (or characters designated as quotation marks in the QuotedString
constructor) would be escaped. Now all escaped characters will be
escaped, and the escaping backslashes will be removed.
- Fixed regression in ParseResults.pop() - pop() was pretty much
broken after I added *improvements* in 2.0.2. Reported by Iain
Shelvington, thanks Iain!
- Fixed bug in And class when initializing using a generator.
- Enhanced ParseResults.dump() method to list out nested ParseResults that
are unnamed arrays of sub-structures.
- Fixed UnboundLocalError under Python 3.4 in oneOf method, reported
on Sourceforge by aldanor, thanks!
- Fixed bug in ParseResults __init__ method, when returning non-ParseResults
types from parse actions that implement __eq__. Raised during discussion
on the pyparsing wiki with cyrfer.
|
Log message:
==========
Change Log
==========
Version 2.0.2 - April, 2014
---------------------------
- Extended "expr(name)" shortcut (same as "expr.setResultsName(name)")
to accept "expr()" as a shortcut for "expr.copy()".
- Added "locatedExpr(expr)" helper, to decorate any returned tokens
with their location within the input string. Adds the results names
locn_start and locn_end to the output parse results.
- Added "pprint()" method to ParseResults, to simplify troubleshooting
and prettified output. Now instead of importing the pprint module
and then writing "pprint.pprint(result)", you can just write
"result.pprint()". This method also accepts additional positional and
keyword arguments (such as indent, width, etc.), which get passed
through directly to the pprint method
(see http://docs.python.org/2/library/pprint.html#pprint.pprint).
- Removed deprecation warnings when using '<<' for Forward expression
assignment. '<<=' is still preferred, but '<<' will be retained
for cases where the '<<=' operator is not suitable (such as in defining
lambda expressions).
- Expanded argument compatibility for classes and functions that
take list arguments, to now accept generators as well.
- Extended list-like behavior of ParseResults, adding support for
append and extend. NOTE: if you have existing applications using
these names as results names, you will have to access them using
dict-style syntax: res["append"] and res["extend"]
- ParseResults emulates the change in list vs. iterator semantics for
methods like keys(), values(), and items(). Under Python 2.x, these
methods will return lists, under Python 3.x, these methods will
return iterators.
- ParseResults now has a method haskeys() which returns True or False
depending on whether any results names have been defined. This simplifies
testing for the existence of results names under Python 3.x, which
returns keys() as an iterator, not a list.
- ParseResults now supports both list and dict semantics for pop().
If passed no argument or an integer argument, it will use list semantics
and pop tokens from the list of parsed tokens. If passed a non-integer
argument (most likely a string), it will use dict semantics and
pop the corresponding value from any defined results names. A
second default return value argument is supported, just as in
dict.pop().
- Fixed bug in markInputline, thanks for reporting this, Matt Grant!
- Cleaned up my unit test environment, now runs with Python 2.6 and
3.3.
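The haskeys() and dual-mode pop() additions to ParseResults in 2.0.2 can be exercised with a short sketch, assuming pyparsing 2.0.2 or later. The grammar and the results names `word` and `num` are hypothetical.

```python
from pyparsing import Word, alphas, nums

expr = Word(alphas)("word") + Word(nums)("num")
result = expr.parseString("abc 123")

# haskeys() reports whether any results names are defined.
has_names = result.haskeys()

# A string argument uses dict semantics and pops the named value.
num = result.pop("num")

# A second argument is the default returned for a missing name,
# just as in dict.pop().
fallback = result.pop("missing", "n/a")

# An integer argument uses list semantics on the parsed tokens.
first = result.pop(0)

print(has_names, num, fallback, first)
```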
--------------------------
|