./textproc/py-lxml, Python binding for libxml2 and libxslt


Branch: CURRENT, Version: 4.2.1, Package name: py27-lxml-4.2.1, Maintainer: pkgsrc-users

lxml is a Pythonic binding for the libxml2 and libxslt libraries.
It is unique in that it combines the speed and feature completeness
of these libraries with the simplicity of a native Python API,
mostly compatible with, but superior to, the well-known ElementTree API.

Required to run:
[textproc/libxml2] [textproc/libxslt] [devel/py-setuptools] [devel/py-cython] [lang/python27]

Required to build:

Master sites:

SHA1: 5ac888d5957f74298fb6daf74778bd91812f7571
RMD160: 9dd038937c8579c0bfa6bf95b845e4945f31c5d0
Filesize: 4183.854 KB

CVS history:

   2018-03-22 08:56:35 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-lxml: updated to 4.2.1

Bugs fixed
* iterwalk() failed to return the ‘start’ event for the initial element if a
  tag selector is used.
* Failure to import 4.2.0 into PyPy due to a missing library symbol.
* Add “-isysroot” linker option on MacOS as needed by XCode 9.
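
The iterwalk() fix above can be illustrated with a small sketch. It assumes
lxml 4.2.1 or later is installed; the sample document and the variable names
are made up for illustration and are not part of the upstream changelog.

```python
from lxml import etree

# Hypothetical sample tree: the root element and one descendant share the tag "a".
root = etree.fromstring("<a><b/><c><a/></c></a>")

# Walk the tree with a tag selector. With the fix, the 'start' event for the
# initial (root) element is reported as well, because it matches the selector.
events = [
    (event, element.tag)
    for event, element in etree.iterwalk(root, events=("start", "end"), tag="a")
]

for event, tag in events:
    print(event, tag)
```

On releases before 4.2.1 the first 'start' event would be missing from the
list, which is the behaviour the changelog entry describes.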
   2018-03-15 09:38:17 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-lxml: updated to 4.2.0

Features added
* SelectElement.value returns more standard-compliant and browser-like defaults
  for non-multi-selects. If no option is selected, the value of the first option
  is returned (instead of None). If multiple options are selected, the value of
  the last one is returned (instead of that of the first one). If no options are
  present (not standard-compliant) SelectElement.value still returns None.
* The HTMLParser() now supports the huge_tree option. Patch by stranac.
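
A minimal sketch of the three SelectElement.value cases described above,
assuming lxml 4.2.0 or later. The helper select_value() and the form markup
are hypothetical and exist only for this illustration.

```python
from lxml import html


def select_value(options_markup):
    """Return SelectElement.value for a single (non-multi) select.

    Hypothetical helper: wraps the given <option> markup in a form and
    reads the value of the resulting <select> element.
    """
    form = html.fromstring(
        '<form><select name="choice">%s</select></form>' % options_markup
    )
    return form.xpath(".//select")[0].value


# No option selected: the value of the first option is returned.
none_selected = select_value(
    '<option value="a">A</option><option value="b">B</option>'
)

# Several options selected: the value of the last selected one is returned.
two_selected = select_value(
    '<option value="a" selected>A</option><option value="b" selected>B</option>'
)

# No options at all: the value is still None.
no_options = select_value("")

print(none_selected, two_selected, no_options)
```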

Bugs fixed
* Some XSLT messages were not captured by the transform error log.
* Crash at shutdown after an interrupted iterparse run with XMLSchema validation.
   2017-11-06 11:14:28 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-lxml: updated to 4.1.1

* Rebuild with Cython 0.27.3 to improve support for Py3.7.
   2017-10-14 12:14:26 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-lxml: update to 4.1.0

Features added
* ElementPath supports text predicates for current node, like "[.='text']".
* ElementPath allows spaces in predicates.
* Custom Element classes and XPath functions can now be registered with a
  decorator rather than explicit dict assignments.
* Static Linux wheels are now built with link time optimisation (LTO) enabled.
  This should have a beneficial impact on the overall performance by providing a
  tighter compiler integration between lxml and libxml2/libxslt.
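
As a rough illustration of the ElementPath text predicate mentioned above,
the following sketch assumes lxml 4.1.0 or later; the sample document is
invented for the example.

```python
from lxml import etree

# Hypothetical sample data: three <item> children, two with the text "apple".
root = etree.fromstring(
    "<list><item>apple</item><item>pear</item><item>apple</item></list>"
)

# "[.='apple']" keeps only those <item> elements whose own text is "apple".
matches = root.findall("item[.='apple']")

print(len(matches), [element.text for element in matches])
```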

Bugs fixed
* Requesting non-Element objects like comments from a document with
  PythonElementClassLookup could fail with a TypeError.
   2017-09-19 13:01:45 by Thomas Klausner | Files touched by this commit (1)
Log message:
py-lxml: remove comment about (fixed) test failure
   2017-09-18 13:59:12 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-lxml: update to 4.0.0


Features added
* The ElementPath implementation is now compiled using Cython,
  which speeds up the ``.find*()`` methods quite significantly.

* The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean``
  are also compiled using Cython in order to speed them up.

* ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.

* ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into
  the descendants of the current element.

* ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to
  allow relative resource lookups.

* The XSLT result object has a new method ``.write_output(file)`` that serialises
  output data into a file according to the ``<xsl:output>`` configuration.
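
The ``skip_subtree()`` method listed above could be used roughly as follows.
This is a sketch assuming lxml 4.0.0 or later; the document and the tag
names (``private``, ``secret``) are hypothetical.

```python
from lxml import etree

# Hypothetical sample tree with a subtree that should not be visited.
root = etree.fromstring(
    "<doc><keep><x/></keep><private><secret/></private><keep/></doc>"
)

# Only 'start' events are requested; skip_subtree() is called while the
# walker is positioned on the start of the element whose children to skip.
walker = etree.iterwalk(root, events=("start",))
visited = []
for event, element in walker:
    visited.append(element.tag)
    if element.tag == "private":
        walker.skip_subtree()  # do not descend into <private>

print(visited)
```

The ``private`` element itself is still reported; only its descendants are
left out of the walk.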

Bugs fixed
* GH-251: HTML comments were handled incorrectly by the soupparser.
  Patch by mozbugbox.

* LP-1654544: The html5parser no longer passes the ``useChardet`` option
  if the input is a Unicode string, unless explicitly requested.  When parsing
  files, the default is to enable it when a URL or file path is passed (because
  the file is then opened in binary mode), and to disable it when reading from
  a file(-like) object.

  Note: This is a backwards incompatible change of the default configuration.
  If your code parses byte strings/streams and depends on character detection,
  please pass the option ``guess_charset=True`` explicitly, which already worked
  in older lxml versions.

* LP-1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.

* LP-1526522: Some RelaxNG errors were not reported in the error log.

* LP-1567526: Empty and plain text input raised a TypeError in soupparser.

* LP-1710429: Uninitialised variable usage in HTML diff.

* LP-1415643: The closing tags context manager in ``xmlfile()`` could continue
  to output end tags even after writing failed with an exception.

* LP-1465357: ``xmlfile.write()`` now accepts and ignores None as input argument.

* Compilation under Py3.7-pre failed due to a modified function signature.

Other changes
* The main module source files were renamed from ``lxml.*.pyx`` to plain
  ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build
  process.  Care was taken to keep the old header files as fallbacks for
  code that compiles against the public C-API of lxml, but it might still
  be worth validating that third-party code does not notice this change.
   2017-08-12 21:17:50 by Thomas Klausner | Files touched by this commit (1) | Package updated
Log message:
Update self test bug status.
   2017-06-04 21:17:51 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
3.8.0 (2017-06-03)

Features added
* ``ElementTree.write()`` has a new option ``doctype`` that writes out a
  doctype string before the serialisation, in the same way as ``tostring()``.

* GH-220: ``xmlfile`` allows switching output methods at an element level.
  Patch by Burak Arslan.

* LP-1595781, GH-240: added a PyCapsule Python API and C-level API for
  passing externally generated libxml2 documents into lxml.

* GH-244: error log entries have a new property ``path`` with an XPath
  expression (if known, None otherwise) that points to the tree element
  responsible for the error. Patch by Bob Kline.

* The namespace prefix mapping that can be used in ElementPath now injects
  a default namespace when passing a None prefix.
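
A small sketch of the new ``doctype`` option of ``ElementTree.write()``,
assuming lxml 3.8.0 or later. The document and the doctype string are made
up for the example.

```python
import io

from lxml import etree

# Hypothetical sample document.
tree = etree.ElementTree(etree.fromstring("<root><child/></root>"))

# Serialise into an in-memory buffer; the doctype string is written out
# before the serialised tree, as with tostring(..., doctype=...).
buffer = io.BytesIO()
tree.write(buffer, doctype="<!DOCTYPE root>")
serialized = buffer.getvalue()

print(serialized.decode("ascii"))
```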

Bugs fixed
* GH-238: Character escapes were not hex-encoded in the ``xmlfile`` serialiser.
  Patch by matejcik.

* GH-229: fix for externally created XML documents.  Patch by Theodore Dubois.

* LP-1665241, GH-228: Form data handling in lxml.html no longer strips the
  option values specified in form attributes but only the text values.
  Patch by Ashish Kulkarni.

* LP-1551797: revert previous fix for XSLT error logging as it breaks
  multi-threaded XSLT processing.

* LP-1673355, GH-233: ``fromstring()`` html5parser failed to parse byte strings.

Other changes
* The previously undocumented ``docstring`` option in ``ElementTree.write()``
  produces a deprecation warning and will eventually be removed.