./www/py-parsel, Library to extract data from HTML and XML using XPath and CSS

Branch: CURRENT, Version: 1.8.1, Package name: py311-parsel-1.8.1, Maintainer: pkgsrc-users

Parsel is a library to extract data from HTML and XML using XPath and CSS
selectors.

Features:
* Extract text using CSS or XPath selectors
* Regular expression helper methods


Required to run:
[devel/py-setuptools] [textproc/py-lxml] [textproc/py-cssselect] [lang/py-six] [www/py-w3lib] [lang/python37]

Required to build:
[pkgtools/cwrappers] [devel/py-test-runner]

Master sites:

Filesize: 49.688 KB

CVS history:


   2023-11-01 19:27:13 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-parsel: updated to 1.8.1

1.8.1 (2023-04-18)
~~~~~~~~~~~~~~~~~~

* Remove a Sphinx reference from NEWS to fix the PyPI description
* Add a ``twine check`` CI check to detect such problems

1.8.0 (2023-04-18)
~~~~~~~~~~~~~~~~~~

* Add support for JMESPath: you can now create a selector for a JSON document
  and call ``Selector.jmespath()``. See the Parsel documentation for more
  information and examples.
* Selectors can now be constructed from ``bytes`` (using the ``body`` and
  ``encoding`` arguments) instead of ``str`` (using the ``text`` argument), so
  that there is no internal conversion from ``str`` to ``bytes`` and the memory
  usage is lower.
* Typing improvements
* The ``pkg_resources`` module (which was absent from the requirements) is no
  longer used
* Documentation build fixes
* New requirements:

  * ``jmespath``
  * ``typing_extensions`` (on Python 3.7)
   2023-01-11 12:47:18 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-parsel: updated to 1.7.0

1.7.0 (2022-11-01)
* Add PEP 561-style type information
* Support for Python 2.7, 3.5 and 3.6 is removed
* Support for Python 3.9-3.11 is added
* Very large documents (with deep nesting or long tag content) can now be
  parsed, and ``Selector`` now takes a new argument ``huge_tree`` that can be
  set to ``False`` to disable this
* Support for new features of cssselect 1.2.0 is added
* The ``Selector.remove()`` and ``SelectorList.remove()`` methods are
  deprecated and replaced with the new ``Selector.drop()`` and
  ``SelectorList.drop()`` methods which don't delete text after the dropped
  elements when used in the HTML mode.
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-10-26 13:31:15 by Nia Alarie | Files touched by this commit (1030)
Log message:
www: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Not committed (merge conflicts):
www/nghttp2/distinfo

Unfetchable distfiles (almost certainly fetched conditionally...):
./www/nginx-devel/distinfo array-var-nginx-module-0.05.tar.gz
./www/nginx-devel/distinfo echo-nginx-module-0.62.tar.gz
./www/nginx-devel/distinfo encrypted-session-nginx-module-0.08.tar.gz
./www/nginx-devel/distinfo form-input-nginx-module-0.12.tar.gz
./www/nginx-devel/distinfo headers-more-nginx-module-0.33.tar.gz
./www/nginx-devel/distinfo lua-nginx-module-0.10.19.tar.gz
./www/nginx-devel/distinfo naxsi-1.3.tar.gz
./www/nginx-devel/distinfo nginx-dav-ext-module-3.0.0.tar.gz
./www/nginx-devel/distinfo nginx-rtmp-module-1.2.2.tar.gz
./www/nginx-devel/distinfo nginx_http_push_module-1.2.10.tar.gz
./www/nginx-devel/distinfo ngx_cache_purge-2.5.1.tar.gz
./www/nginx-devel/distinfo ngx_devel_kit-0.3.1.tar.gz
./www/nginx-devel/distinfo ngx_http_geoip2_module-3.3.tar.gz
./www/nginx-devel/distinfo njs-0.5.0.tar.gz
./www/nginx-devel/distinfo set-misc-nginx-module-0.32.tar.gz
./www/nginx/distinfo array-var-nginx-module-0.05.tar.gz
./www/nginx/distinfo echo-nginx-module-0.62.tar.gz
./www/nginx/distinfo encrypted-session-nginx-module-0.08.tar.gz
./www/nginx/distinfo form-input-nginx-module-0.12.tar.gz
./www/nginx/distinfo headers-more-nginx-module-0.33.tar.gz
./www/nginx/distinfo lua-nginx-module-0.10.19.tar.gz
./www/nginx/distinfo naxsi-1.3.tar.gz
./www/nginx/distinfo nginx-dav-ext-module-3.0.0.tar.gz
./www/nginx/distinfo nginx-rtmp-module-1.2.2.tar.gz
./www/nginx/distinfo nginx_http_push_module-1.2.10.tar.gz
./www/nginx/distinfo ngx_cache_purge-2.5.1.tar.gz
./www/nginx/distinfo ngx_devel_kit-0.3.1.tar.gz
./www/nginx/distinfo ngx_http_geoip2_module-3.3.tar.gz
./www/nginx/distinfo njs-0.5.0.tar.gz
./www/nginx/distinfo set-misc-nginx-module-0.32.tar.gz
   2021-10-07 17:09:00 by Nia Alarie | Files touched by this commit (1033)
Log message:
www: Remove SHA1 hashes for distfiles
   2020-05-17 22:37:20 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-parsel: updated to 1.6.0

1.6.0:
* Python 3.4 is no longer supported
* New ``Selector.remove()`` and ``SelectorList.remove()`` methods to remove
  selected elements from the parsed document tree
* Improvements to error reporting, test coverage and documentation, and code
  cleanup
   2019-08-12 22:04:22 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-parsel: updated to 1.5.2

1.5.2:
* ``Selector.remove_namespaces`` received a significant performance improvement.
* The value of data within the printable representation of a selector
  (``repr(selector)``) now ends in ``...`` when truncated, to make the
  truncation obvious.
* Minor documentation improvements.
   2018-11-15 10:53:33 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-parsel: updated to 1.5.1

1.5.1:
* has-class XPath function handles newlines and other separators
  in class names properly;
* fixed parsing of HTML documents with null bytes;
* documentation improvements;
* Python 3.7 tests are run on CI; other test improvements.