./devel/py-pyparsing, Parsing module for Python



Branch: CURRENT, Version: 2.4.0, Package name: py27-pyparsing-2.4.0, Maintainer: pkgsrc-users

The pyparsing module is an alternative approach to creating and executing
simple grammars, vs. the traditional lex/yacc approach, or the use of regular
expressions. The pyparsing module provides a library of classes that client
code uses to construct the grammar directly in Python code.


Required to run:
[devel/py-setuptools] [lang/python27]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: dfddeaf3e8973875693d1a40ccee87f39dd120bd
RMD160: 6d139406b6b5bee25d04df435c8b42c0dabf10ea
Filesize: 597.613 KB

CVS history:


   2019-04-08 12:41:05 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pyparsing: updated to 2.4.0

Version 2.4.0:

- Well, it looks like the API change that was introduced in 2.3.1 was more
  drastic than expected, so for a friendlier forward upgrade path, this
  release:
  . Bumps the current version number to 2.4.0, to reflect this
    incompatible change.
  . Adds a pyparsing.__compat__ object for specifying compatibility with
    future breaking changes.
  . Conditionalizes the API-breaking behavior, based on the value
    pyparsing.__compat__.collect_all_And_tokens.  By default, this value
    will be set to True, reflecting the new bugfixed behavior. To set this
    value to False, add to your code:

        import pyparsing
        pyparsing.__compat__.collect_all_And_tokens = False

  . User code that is dependent on the pre-bugfix behavior can restore
    it by setting this value to False.

  In 2.5 and later versions, the conditional code will be removed and
  setting the flag to True or False in these later versions will have no
  effect.

- Updated unitTests.py and simple_unit_tests.py to be compatible with
  "python setup.py test". To run tests using setup, do:

      python setup.py test
      python setup.py test -s unitTests.suite
      python setup.py test -s simple_unit_tests.suite

- Fixed bug in runTests handling '\n' literals in quoted strings.

- Added tag_body attribute to the start tag expressions generated by
  makeHTMLTags, so that you can avoid using SkipTo to roll your own
  tag body expression:

      a, aEnd = pp.makeHTMLTags('a')
      link = a + a.tag_body("displayed_text") + aEnd
      for t in link.searchString(html_page):
          print(t.displayed_text, '->', t.startA.href)

- Improved indentedBlock failure handling.

- Address Py2 incompatibility in simpleUnitTests, plus explain() and
  Forward str() cleanup.

- Fixed docstring with embedded '\w', which creates SyntaxWarnings in
  Py3.8.

- Examples:
  - Added example parser for rosettacode.org tutorial compiler.

  - Added example to show how an HTML table can be parsed into a
    collection of Python lists or dicts, one per row.

  - Updated SimpleSQL.py example to handle nested selects, reworked
    'where' expression to use infixNotation.

  - Added include_preprocessor.py, similar to macroExpander.py.

  - Examples using makeHTMLTags use new tag_body expression when
    retrieving a tag's body text.

  - Updated examples that are runnable as unit tests:
        python setup.py test -s examples.antlr_grammar_tests
        python setup.py test -s examples.test_bibparse
   2019-01-15 12:37:21 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pyparsing: updated to 2.3.1

Version 2.3.1
-------------
- POSSIBLE API CHANGE: this release fixes a bug when results names were
  attached to a MatchFirst or Or object containing an And object.
  Previously, a results name on an And object within an enclosing MatchFirst
  or Or could return just the first token in the And. Now, all the tokens
  matched by the And are correctly returned. This may result in subtle
  changes in the tokens returned if you have this condition in your pyparsing
  scripts.

- New staticmethod ParseException.explain() to help diagnose parse exceptions
  by showing the failing input line and the trace of ParserElements in
  the parser leading up to the exception. explain() returns a multiline
  string listing each element by name. (This is still an experimental
  method, and the method signature and format of the returned string may
  evolve over the next few releases.)

  Example:
        # define a parser to parse an integer followed by an
        # alphabetic word
        expr = (pp.Word(pp.nums).setName("int")
                + pp.Word(pp.alphas).setName("word"))
        try:
            # parse a string with a numeric second value instead of alpha
            expr.parseString("123 355")
        except pp.ParseException as pe:
            print(pp.ParseException.explain(pe))

  Prints:
        123 355
            ^
        ParseException: Expected word (at char 4), (line:1, col:5)
        __main__.ExplainExceptionTest
        pyparsing.And - {int word}
        pyparsing.Word - word

  explain() will accept any exception type and will list the function
  names and parse expressions in the stack trace. This is especially
  useful when an exception is raised in a parse action.

  Note: explain() is only supported under Python 3.

- Fix bug in dictOf which could match an empty sequence, making it
  infinitely loop if wrapped in a OneOrMore.

- Added unicode sets to pyparsing_unicode for Latin-A and Latin-B ranges.

- Added ability to define custom unicode sets as combinations of other sets
  using multiple inheritance.

    class Turkish_set(pp.pyparsing_unicode.Latin1, pp.pyparsing_unicode.LatinA):
        pass

    turkish_word = pp.Word(Turkish_set.alphas)

- Updated state machine import examples, with state machine demos for:
  . traffic light
  . library book checkin/checkout
  . document review/approval

  In the traffic light example, you can use the custom 'statemachine' keyword
  to define the states for a traffic light, and have the state classes
  auto-generated for you:

      statemachine TrafficLightState:
          Red -> Green
          Green -> Yellow
          Yellow -> Red

  Similar for state machines with named transitions, like the library book
  state example:

      statemachine LibraryBookState:
          New -(shelve)-> Available
          Available -(reserve)-> OnHold
          OnHold -(release)-> Available
          Available -(checkout)-> CheckedOut
          CheckedOut -(checkin)-> Available

  Once the classes are defined, then additional Python code can reference those
  classes to add class attributes, instance methods, etc.

  See the examples in examples/statemachine

- Added an example parser for the decaf language. This language is used in
  CS compiler classes in many colleges and universities.

- Fixed up docstrings to Sphinx format, included test files in the source
  package, and converted markdown to rst throughout the distribution; great
  job by Matěj Cepl!

- Expanded the whitespace characters recognized by the White class to include
  all unicode defined spaces.

- Added optional postParse argument to ParserElement.runTests() to add a
  custom callback to be called for test strings that parse successfully. Useful
  for running tests that do additional validation or processing on the parsed
  results. See updated chemicalFormulas.py example.

- Removed distutils fallback in setup.py. If installing the package fails,
  please update to the latest version of setuptools. Plus overall project code
  cleanup (CRLFs, whitespace, imports, etc.), thanks Jon Dufresne!

- Fix bug in CaselessKeyword, to make its behavior consistent with
  Keyword(caseless=True).
   2018-12-10 16:18:41 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pyparsing: 2.3.0

Version 2.3.0:
--------------
- NEW SUPPORT FOR UNICODE CHARACTER RANGES
  This release introduces the pyparsing_unicode namespace class, defining
  a series of language character sets to simplify the definition of alphas,
  nums, alphanums, and printables in the following language sets:
   . Arabic
   . Chinese
   . Cyrillic
   . Devanagari
   . Greek
   . Hebrew
   . Japanese (including Kanji, Katakana, and Hiragana subsets)
   . Korean
   . Latin1 (includes 7 and 8-bit Latin characters)
   . Thai
   . CJK (combination of Chinese, Japanese, and Korean sets)

  For example, your code can define words using:

    korean_word = Word(pyparsing_unicode.Korean.alphas)

  See their use in the updated examples greetingInGreek.py and
  greetingInKorean.py.

  This namespace class also offers access to these sets using their
  unicode identifiers.

- POSSIBLE API CHANGE: Fixed bug where a parse action that explicitly
  returned the input ParseResults could add another nesting level in
  the results if the current expression had a results name.

        vals = pp.OneOrMore(pp.pyparsing_common.integer)("int_values")

        def add_total(tokens):
            tokens['total'] = sum(tokens)
            return tokens  # this line can be removed

        vals.addParseAction(add_total)
        print(vals.parseString("244 23 13 2343").dump())

  Before the fix, this code would print (note the extra nesting level):

    [244, 23, 13, 2343]
    - int_values: [244, 23, 13, 2343]
      - int_values: [244, 23, 13, 2343]
      - total: 2623
    - total: 2623

  With the fix, this code now prints:

    [244, 23, 13, 2343]
    - int_values: [244, 23, 13, 2343]
    - total: 2623

  This fix will change the structure of ParseResults returned if a
  program defines a parse action that returns the tokens that were
  sent in. This is not necessary, and statements like "return tokens"
  in the example above can be safely deleted prior to upgrading to
  this release, in order to avoid the bug and get the new behavior.

  Reported by seron in Issue 22, nice catch!

- POSSIBLE API CHANGE: Fixed a related bug where a results name
  erroneously created a second level of hierarchy in the returned
  ParseResults. The intent for accumulating results names into ParseResults
  is that, in the absence of Group'ing, all names get merged into a
  common namespace. This allows us to write:

       key_value_expr = (Word(alphas)("key") + '=' + Word(nums)("value"))
       result = key_value_expr.parseString("a = 100")

  and have result structured as {"key": "a", "value": "100"}
  instead of [{"key": "a"}, {"value": "100"}].

  However, if a named expression is used in a higher-level non-Group
  expression that *also* has a name, a false sub-level would be created
  in the namespace:

        num = pp.Word(pp.nums)
        num_pair = ("[" + (num("A") + num("B"))("values") + "]")
        U = num_pair.parseString("[ 10 20 ]")
        print(U.dump())

  Since there is no grouping, "A", "B", and "values" should all appear
  at the same level in the results, as:

        ['[', '10', '20', ']']
        - A: '10'
        - B: '20'
        - values: ['10', '20']

  Instead, an extra level of "A" and "B" show up under "values":

        ['[', '10', '20', ']']
        - A: '10'
        - B: '20'
        - values: ['10', '20']
          - A: '10'
          - B: '20'

  This bug has been fixed. Now, if this hierarchy is desired, then a
  Group should be added:

        num_pair = ("[" + pp.Group(num("A") + num("B"))("values") + "]")

  Giving:

        ['[', ['10', '20'], ']']
        - values: ['10', '20']
          - A: '10'
          - B: '20'

  But in no case should "A" and "B" appear in multiple levels. This bug-fix
  fixes that.

  If you have current code which relies on this behavior, then add or remove
  Groups as necessary to get your intended results structure.

  Reported by Athanasios Anastasiou.

- IndexErrors raised in parse actions will get explicitly reraised
  as ParseExceptions that wrap the original IndexError. Since
  IndexError sometimes occurs as part of pyparsing's normal parsing
  logic, IndexErrors that are raised during a parse action may have
  gotten silently reinterpreted as parsing errors. To retain the
  information from the IndexError, these exceptions will now be
  raised as ParseExceptions that reference the original IndexError.
  This wrapping will only be visible when run under Python3, since it
  emulates "raise ... from ..." syntax.

  Addresses Issue 4, reported by guswns0528.
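
  The wrapping described above can be pictured with plain Python exception
  chaining. This is only a sketch of the "raise ... from ..." idea, not
  pyparsing's internal code: ParseError, buggy_action and run_action are
  hypothetical names invented for the illustration.

```python
class ParseError(Exception):
    """Hypothetical stand-in for pyparsing.ParseException."""


def buggy_action(tokens):
    # A parse action with an off-by-N bug: raises IndexError.
    return tokens[5]


def run_action(action, tokens):
    # Wrap an IndexError from the action instead of letting it be
    # mistaken for an ordinary parsing failure.
    try:
        return action(tokens)
    except IndexError as err:
        raise ParseError("exception raised in parse action") from err


try:
    run_action(buggy_action, ["a", "b"])
except ParseError as pe:
    cause = pe.__cause__  # the original IndexError is preserved here
    print(type(cause).__name__, "-", cause)
```

  The original error stays reachable through __cause__, which is what makes
  the wrapped exception easier to diagnose.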

- Added Char class to simplify defining expressions of a single
  character. (Char("abc") is equivalent to Word("abc", exact=1))
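
  A short check of the stated equivalence, assuming pyparsing 2.3.0 or later
  is installed. The vowel set and the input string are arbitrary choices for
  the illustration.

```python
import pyparsing as pp

# Two ways to match exactly one character from a set.
vowel = pp.Char("aeiou")
vowel_as_word = pp.Word("aeiou", exact=1)

# Both expressions consume only the leading "u" of the input.
char_tokens = vowel.parseString("um").asList()
word_tokens = vowel_as_word.parseString("um").asList()
print(char_tokens, word_tokens)
```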

- Added class PrecededBy to perform lookbehind tests. PrecededBy is
  used in the same way as FollowedBy, passing in an expression that
  must occur just prior to the current parse location.

  For fixed-length expressions like a Literal, Keyword, Char, or a
  Word with an `exact` or `maxLen` length given, `PrecededBy(expr)`
  is sufficient. For varying length expressions like a Word with no
  given maximum length, `PrecededBy` must be constructed with an
  integer `retreat` argument, as in
  `PrecededBy(Word(alphas, nums), retreat=10)`, to specify the maximum
  number of characters pyparsing must look backward to make a match.
  pyparsing will check all the values from 1 up to retreat characters
  back from the current parse location.

  When stepping backwards through the input string, PrecededBy does
  *not* skip over whitespace.

  PrecededBy can be created with a results name so that, even though
  it always returns an empty parse result, the result *can* include
  named results.

  Idea first suggested in Issue 30 by Freakwill.

- Updated FollowedBy to accept expressions that contain named results,
  so that results names defined in the lookahead expression will be
  returned, even though FollowedBy always returns an empty list.
  Inspired by the same feature implemented in PrecededBy.
   2018-10-03 13:50:46 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-pyparsing: updated to 2.2.2

Version 2.2.2 - September, 2018
-------------------------------
- Fixed bug in SkipTo: if a SkipTo expression was skipping to
  an expression that returned a list (such as an And), and the
  SkipTo was saved as a named result, the named result could be
  saved as a ParseResults; it should always be saved as a string.
  Issue 28, reported by seron.

- Added simple_unit_tests.py, as a collection of easy-to-follow unit
  tests for various classes and features of the pyparsing library.
  The primary intent is to be instructional rather than to provide
  rigorous testing. Complex tests can still be added in the unitTests.py file.

- New features added to the Regex class:
  - optional asGroupList parameter, returns all the capture groups as
    a list
  - optional asMatch parameter, returns the raw re.match result
  - new sub(repl) method, which adds a parse action calling
    re.sub(pattern, repl, parsed_result). Simplifies creating
    Regex expressions to be used with transformString. Like re.sub,
    repl may be an ordinary string (similar to using pyparsing's
    replaceWith), or may contain references to capture groups by group
    number, or may be a callable that takes an re match group and
    returns a string.

    For instance:
        expr = pp.Regex(r"([Hh]\d):\s*(.*)").sub(r"<\1>\2</\1>")
        expr.transformString("h1: This is the title")

    will return
        <h1>This is the title</h1>
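
    Since sub(repl) is described as calling re.sub, the substitution itself
    can be checked with the standard library alone. The sketch below reuses
    the pattern and replacement from the example above; it does not exercise
    pyparsing, only the underlying re.sub behavior for group references and
    for a callable repl.

```python
import re

pattern = r"([Hh]\d):\s*(.*)"
text = "h1: This is the title"

# repl as a string with numbered group references
tagged = re.sub(pattern, r"<\1>\2</\1>", text)
print(tagged)

# repl as a callable that receives the match object
def upper_title(match):
    return "<{0}>{1}</{0}>".format(match.group(1), match.group(2).upper())

shouted = re.sub(pattern, upper_title, text)
print(shouted)
```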

- Fixed omission of LICENSE file in source tarball, also added
  CODE_OF_CONDUCT.md per GitHub community standards.

Version 2.2.1 - September, 2018
-------------------------------
- Applied changes necessary to migrate hosting of pyparsing source
  over to GitHub. Many thanks for help and contributions from hugovk,
  jdufresne, and cngkaygusuz among others through this transition,
  sorry it took me so long!

- Fixed import of collections.abc to address DeprecationWarnings
  in Python 3.7.

- Updated oc.py example to support function calls in arithmetic
  expressions; fixed regex for '==' operator; and added packrat
  parsing. Raised on the pyparsing wiki by Boris Marin, thanks!

- Fixed bug in select_parser.py example, group_by_terms was not
  reported. Reported on SF bugs by Adam Groszer, thanks Adam!

- Added "Getting Started" section to the module docstring, to
  guide new users to the most common starting points in pyparsing's
  API.

- Fixed bug in Literal and Keyword classes, which erroneously
  raised IndexError instead of ParseException.
   2017-06-01 15:31:46 by Thomas Klausner | Files touched by this commit (4)
Log message:
BOOTSTRAP_SETUPTOOLS is not necessary any longer.
Leave it commented out for now.
   2017-04-07 05:35:12 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
Version 2.2.0 - March, 2017
---------------------------
- Bumped minor version number to reflect compatibility issues with
  OneOrMore and ZeroOrMore bugfixes in 2.1.10. (2.1.10 fixed a bug
  that was introduced in 2.1.4, but the fix could break code
  written against 2.1.4 - 2.1.9.)

- Updated setup.py to address recursive import problems now
  that pyparsing is part of 'packaging' (used by setuptools).

- Fixed KeyError issue reported by Yann Bizeul when using packrat
  parsing in the Graphite time series database, thanks Yann!

- Fixed incorrect usages of '\' in literals.

- Minor internal change when using '-' operator, to be compatible
  with ParserElement.streamline() method.

- Expanded infixNotation to accept a list or tuple of parse actions
  to attach to an operation.

- New unit test added for dill support for storing pyparsing parsers.
  Ordinary Python pickle can be used to pickle pyparsing parsers as
  long as they do not use any parse actions. The 'dill' module is an
  extension to pickle which *does* support pickling of attached
  parse actions.
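
  The limitation is easy to reproduce with the standard library alone:
  pickle cannot serialize a function defined inside another function. Whether
  this is exactly the obstacle pyparsing hits with attached parse actions is
  an assumption here, and make_action is a hypothetical helper for the
  sketch.

```python
import pickle


def make_action(scale):
    # Returns a closure, similar in spirit to a wrapped parse action.
    def action(tokens):
        return [int(t) * scale for t in tokens]
    return action


action = make_action(10)

try:
    pickle.dumps(action)
    picklable = True
except (pickle.PicklingError, AttributeError):
    # The exception type varies between Python versions.
    picklable = False

print("closure picklable with stdlib pickle:", picklable)
```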
   2017-02-08 13:11:09 by Thomas Klausner | Files touched by this commit (3)
Log message:
Mark setuptools dependencies with BOOTSTRAP_SETUPTOOLS=yes.
   2017-01-25 19:04:24 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
Version 2.1.10 - October, 2016
-------------------------------
- Fixed bug in reporting named parse results for ZeroOrMore
  expressions, thanks Ethan Nash for reporting this!

- Fixed behavior of LineStart to be much more predictable.
  LineStart can now be used to detect if the next parse position
  is col 1, factoring in potential leading whitespace (which would
  cause LineStart to fail). Also fixed a bug in col, which is
  used in LineStart, where '\n's were erroneously considered to
  be column 1.

- Added support for multiline test strings in runTests.

- Fixed bug in ParseResults.dump when keys were not strings.
  Also changed display of string values to show them in quotes,
  to help distinguish parsed numeric strings from parsed integers
  that have been converted to Python ints.
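
  A small sketch of the quoting behavior, assuming a pyparsing release that
  provides pyparsing_common (2.1.x or later). The results names "raw" and
  "converted" are made up for the illustration.

```python
import pyparsing as pp

# The same digits parsed twice: once left as a string, once converted to int.
expr = pp.Word(pp.nums)("raw") + pp.pyparsing_common.integer("converted")

dumped = expr.parseString("100 100").dump()
print(dumped)
```

  In the dump, the unconverted value is shown in quotes while the converted
  integer is not.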