./textproc/py-lxml, Python binding for libxml2 and libxslt


Branch: CURRENT, Version: 4.0.0, Package name: py27-lxml-4.0.0, Maintainer: pkgsrc-users

lxml is a Pythonic binding for the libxml2 and libxslt libraries.
It is unique in that it combines the speed and feature completeness
of these libraries with the simplicity of a native Python API,
mostly compatible with, but superior to, the well-known ElementTree API.


Required to run:
[textproc/libxml2] [textproc/libxslt] [devel/py-setuptools] [devel/py-cython] [lang/python27]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: 6f991c9649bfe58527516bbe287b8ddc1d4d3a12
RMD160: c42cfcfe4a337eb3e9bf5d446b635e96153cd8d7
Filesize: 4118.458 KB

CVS history:


   2017-09-19 13:01:45 by Thomas Klausner | Files touched by this commit (1)
Log message:
py-lxml: remove comment about (fixed) test failure

   2017-09-18 13:59:12 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-lxml: update to 4.0.0

4.0.0:

Features added
--------------
* The ElementPath implementation is now compiled using Cython,
  which speeds up the ``.find*()`` methods quite significantly.

* The modules ``lxml.builder``, ``lxml.html.diff`` and ``lxml.html.clean``
  are also compiled using Cython in order to speed them up.

* ``xmlfile()`` supports async coroutines using ``async with`` and ``await``.

* ``iterwalk()`` has a new method ``skip_subtree()`` that prevents walking into
  the descendants of the current element.

* ``RelaxNG.from_rnc_string()`` accepts a ``base_url`` argument to
  allow relative resource lookups.

* The XSLT result object has a new method ``.write_output(file)`` that serialises
  output data into a file according to the ``<xsl:output>`` configuration.
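
Two of the features above can be sketched roughly as follows: pruning a tree walk with ``skip_subtree()`` and writing an XSLT result with ``.write_output()``. The sample documents, the stylesheet and the output file name are invented for illustration; only the lxml calls themselves are taken from the changelog entries.

```python
import os
import tempfile

from lxml import etree

# 1) iterwalk(): skip the descendants of <a> while walking the tree.
root = etree.fromstring("<root><a><b/></a><c/></root>")
visited = []
walker = etree.iterwalk(root, events=("start",))
for event, elem in walker:
    visited.append(elem.tag)
    if elem.tag == "a":
        walker.skip_subtree()  # <b/> is never reported

# 2) XSLT result: serialise according to the <xsl:output> settings.
transform = etree.XSLT(etree.XML(
    '<xsl:stylesheet version="1.0" '
    'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:output method="text" encoding="UTF-8"/>'
    '<xsl:template match="/">'
    'Hello, <xsl:value-of select="/greeting/@to"/>!'
    '</xsl:template>'
    '</xsl:stylesheet>'
))
result = transform(etree.XML('<greeting to="world"/>'))

with tempfile.TemporaryDirectory() as out_dir:
    out_path = os.path.join(out_dir, "greeting.txt")  # hypothetical name
    result.write_output(out_path)
    with open(out_path, "rb") as f:
        written = f.read()
```

After the loop, ``visited`` contains the tags of ``root``, ``a`` and ``c`` only. Note that ``skip_subtree()`` only has an effect when called right after a ``start`` event.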

Bugs fixed
----------
* GH-251: HTML comments were handled incorrectly by the soupparser.
  Patch by mozbugbox.

* LP-1654544: The html5parser no longer passes the ``useChardet`` option
  if the input is a Unicode string, unless explicitly requested.  When parsing
  files, the default is to enable it when a URL or file path is passed (because
  the file is then opened in binary mode), and to disable it when reading from
  a file(-like) object.

  Note: This is a backwards incompatible change of the default configuration.
  If your code parses byte strings/streams and depends on character detection,
  please pass the option ``guess_charset=True`` explicitly, which already worked
  in older lxml versions.

* LP-1703810: ``etree.fromstring()`` failed to parse UTF-32 data with BOM.

* LP-1526522: Some RelaxNG errors were not reported in the error log.

* LP-1567526: Empty and plain text input raised a TypeError in soupparser.

* LP-1710429: Uninitialised variable usage in HTML diff.

* LP-1415643: The closing tags context manager in ``xmlfile()`` could continue
  to output end tags even after writing failed with an exception.

* LP-1465357: ``xmlfile.write()`` now accepts and ignores None as an input argument.

* Compilation under Python 3.7 pre-releases failed due to a modified function signature.
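
The LP-1465357 change is convenient when the written content comes from a source that may yield None. A minimal sketch, assuming lxml 4.0.0 or later; the element name and messages are made up for the example:

```python
import io

from lxml import etree

buf = io.BytesIO()
with etree.xmlfile(buf, encoding="utf-8") as xf:
    with xf.element("log"):
        for message in ("started", None, "finished"):
            # None is accepted and silently skipped instead of
            # raising a TypeError.
            xf.write(message)

serialized = buf.getvalue()
parsed = etree.fromstring(serialized)
```

Parsing the output back shows that only the two string values ended up in the element's text.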

Other changes
-------------
* The main module source files were renamed from ``lxml.*.pyx`` to plain
  ``*.pyx`` (e.g. ``etree.pyx``) to simplify their handling in the build
  process.  Care was taken to keep the old header files as fallbacks for
  code that compiles against the public C-API of lxml, but it might still
  be worth validating that third-party code does not notice this change.

   2017-08-12 21:17:50 by Thomas Klausner | Files touched by this commit (1) | Package updated
Log message:
Update self test bug status.

   2017-06-04 21:17:51 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
3.8.0 (2017-06-03)

Features added
--------------
* ``ElementTree.write()`` has a new option ``doctype`` that writes out a
  doctype string before the serialisation, in the same way as ``tostring()``.

* GH-220: ``xmlfile`` allows switching output methods at an element level.
  Patch by Burak Arslan.

* LP-1595781, GH-240: added a PyCapsule Python API and C-level API for
  passing externally generated libxml2 documents into lxml.

* GH-244: error log entries have a new property ``path`` with an XPath
  expression (if known, None otherwise) that points to the tree element
  responsible for the error. Patch by Bob Kline.

* The namespace prefix mapping that can be used in ElementPath now injects
  a default namespace when passing a None prefix.
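
As an illustration of the ``doctype`` option and the None prefix mapping listed above, here is a small sketch assuming lxml 3.8.0 or later. The XHTML sample document is invented for the example.

```python
import io

from lxml import etree

XHTML_NS = "http://www.w3.org/1999/xhtml"
root = etree.fromstring(
    '<html xmlns="%s"><body><p>hi</p></body></html>' % XHTML_NS
)

# A None prefix injects a default namespace into unprefixed
# ElementPath steps, so "body/p" matches the namespaced elements.
paragraph = root.find("body/p", namespaces={None: XHTML_NS})

# ElementTree.write() puts the doctype string before the serialisation.
out = io.BytesIO()
etree.ElementTree(root).write(out, doctype="<!DOCTYPE html>")
serialized = out.getvalue()
```

The None prefix applies to the ElementPath methods (``find()``, ``findall()``, ``iterfind()``, ``findtext()``); ``xpath()`` follows XPath 1.0 rules and does not accept a default namespace.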

Bugs fixed
----------
* GH-238: Character escapes were not hex-encoded in the ``xmlfile`` serialiser.
  Patch by matejcik.

* GH-229: fix for externally created XML documents.  Patch by Theodore Dubois.

* LP-1665241, GH-228: Form data handling in lxml.html no longer strips the
  option values specified in form attributes but only the text values.
  Patch by Ashish Kulkarni.

* LP-1551797: revert previous fix for XSLT error logging as it breaks
  multi-threaded XSLT processing.

* LP-1673355, GH-233: ``fromstring()`` html5parser failed to parse byte strings.

Other changes
-------------
* The previously undocumented ``docstring`` option in ``ElementTree.write()``
  produces a deprecation warning and will eventually be removed.

   2017-05-04 23:19:29 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
Changes 3.7.3:
Bugs fixed
* The fix for GH-218 (escaping of script/style text content in
  ``xmlfile.element()``) was ineffective in Python 3.
* GH-222: lxml.html.submit_form() failed in Python 3.

   2017-01-17 13:58:29 by Thomas Klausner | Files touched by this commit (1)
Log message:
Fix typo.

   2017-01-17 12:10:13 by Thomas Klausner | Files touched by this commit (1)
Log message:
Add another bug report for failing tests.

   2017-01-16 12:07:12 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
Updated py-lxml to 3.7.2.

==============
lxml changelog
==============

3.7.2 (2017-01-08)
==================

Bugs fixed
----------

* Work around installation problems in recent Python 2.7 versions
  due to FTP download failures.

* GH-219: ``xmlfile.element()`` was not properly quoting attribute values.
  Patch by Burak Arslan.

* GH-218: ``xmlfile.element()`` was not properly escaping text content of
  script/style tags.  Patch by Burak Arslan.
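
A quick way to check the quoting behaviour on an installed version is to round-trip a value with special characters through ``xmlfile.element()``. This is only a sketch with an invented element name and attribute value, and it covers attribute values and plain text content in the default XML output method, not the script/style case of GH-218.

```python
import io

from lxml import etree

title = 'Tom & "Jerry" <1940>'  # made-up value with special characters

buf = io.BytesIO()
with etree.xmlfile(buf, encoding="utf-8") as xf:
    with xf.element("film", title=title):
        xf.write("cat & mouse")

serialized = buf.getvalue()

# If quoting and escaping are correct, the output is well-formed and
# parses back to exactly the values that were written.
film = etree.fromstring(serialized)
```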