Subject: CVS commit: pkgsrc/www/py-scrapy
From: Adam Ciarcinski
Date: 2023-05-10 14:40:45
Message id: 20230510124045.19EFCFA87@cvs.NetBSD.org
Log Message:
py-scrapy: updated to 2.9.0
Scrapy 2.9.0 (2023-05-08)
-------------------------
Highlights:
- Per-domain download settings.
- Compatibility with new cryptography_ and new parsel_.
- JMESPath selectors from the new parsel_.
- Bug fixes.
Deprecations
~~~~~~~~~~~~
- :class:`scrapy.extensions.feedexport._FeedSlot` is renamed to
:class:`scrapy.extensions.feedexport.FeedSlot` and the old name is
deprecated. (:issue:`5876`)
New features
~~~~~~~~~~~~
- Settings corresponding to :setting:`DOWNLOAD_DELAY`,
:setting:`CONCURRENT_REQUESTS_PER_DOMAIN` and
:setting:`RANDOMIZE_DOWNLOAD_DELAY` can now be set on a per-domain basis
via the new :setting:`DOWNLOAD_SLOTS` setting. (:issue:`5328`)
- Added :meth:`TextResponse.jmespath`, a shortcut for JMESPath selectors
available since parsel_ 1.8.1. (:issue:`5894`, :issue:`5915`)
- Added :signal:`feed_slot_closed` and :signal:`feed_exporter_closed`
signals. (:issue:`5876`)
- Added :func:`scrapy.utils.request.request_to_curl`, a function to produce a
curl command from a :class:`~scrapy.Request` object. (:issue:`5892`)
- Values of :setting:`FILES_STORE` and :setting:`IMAGES_STORE` can now be
:class:`pathlib.Path` instances. (:issue:`5801`)
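As a rough illustration of the two settings-related features above, a project's ``settings.py`` might contain something like the following fragment. This is a minimal sketch, not taken from the release notes: it assumes the per-slot keys ``concurrency``, ``delay`` and ``randomize_delay`` described in the Scrapy documentation for :setting:`DOWNLOAD_SLOTS`, and the domain names, values and directory names are hypothetical.

```python
# settings.py (illustrative fragment; domains, values and paths are hypothetical)
from pathlib import Path

# Per-domain overrides. The keys are assumed to map to
# CONCURRENT_REQUESTS_PER_DOMAIN, DOWNLOAD_DELAY and RANDOMIZE_DOWNLOAD_DELAY
# respectively; domains not listed keep using the global settings.
DOWNLOAD_SLOTS = {
    "quotes.example.com": {"concurrency": 1, "delay": 2, "randomize_delay": False},
    "books.example.com": {"delay": 3, "randomize_delay": False},
}

# FILES_STORE and IMAGES_STORE can be pathlib.Path instances as of 2.9.0,
# so paths can be composed without string concatenation.
FILES_STORE = Path("downloads") / "files"
IMAGES_STORE = Path("downloads") / "images"
```

A slot entry does not have to define every key; in this sketch ``books.example.com`` sets only the delay-related values and would fall back to the global concurrency limit.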
Bug fixes
~~~~~~~~~
- Fixed a warning with Parsel 1.8.1+. (:issue:`5903`, :issue:`5918`)
- Fixed an error when using feed postprocessing with S3 storage.
(:issue:`5500`, :issue:`5581`)
- Added the missing :meth:`scrapy.settings.BaseSettings.setdefault` method.
(:issue:`5811`, :issue:`5821`)
- Fixed an error when using cryptography_ 40.0.0+ and
:setting:`DOWNLOADER_CLIENT_TLS_VERBOSE_LOGGING` is enabled.
(:issue:`5857`, :issue:`5858`)
- The checksums returned by :class:`~scrapy.pipelines.files.FilesPipeline`
for files on Google Cloud Storage are no longer Base64-encoded.
(:issue:`5874`, :issue:`5891`)
- :func:`scrapy.utils.request.request_from_curl` now supports $-prefixed
string values for the curl ``--data-raw`` argument, which are produced by
browsers for data that includes certain symbols. (:issue:`5899`,
:issue:`5901`)
- The :command:`parse` command now also works with async generator callbacks.
(:issue:`5819`, :issue:`5824`)
- The :command:`genspider` command now properly works with HTTPS URLs.
(:issue:`3553`, :issue:`5808`)
- Improved handling of asyncio loops. (:issue:`5831`, :issue:`5832`)
- :class:`LinkExtractor <scrapy.linkextractors.lxmlhtml.LxmlLinkExtractor>`
now skips certain malformed URLs instead of raising an exception.
(:issue:`5881`)
- :func:`scrapy.utils.python.get_func_args` now supports more types of
callables. (:issue:`5872`, :issue:`5885`)
- Fixed an error when processing non-UTF8 values of ``Content-Type`` headers.
(:issue:`5914`, :issue:`5917`)
- Fixed an error breaking user handling of send failures in
:meth:`scrapy.mail.MailSender.send()`. (:issue:`1611`, :issue:`5880`)
Documentation
~~~~~~~~~~~~~
- Expanded contributing docs. (:issue:`5109`, :issue:`5851`)
- Added blacken-docs_ to pre-commit and reformatted the docs with it.
(:issue:`5813`, :issue:`5816`)
- Fixed a JS issue. (:issue:`5875`, :issue:`5877`)
- Fixed ``make htmlview``. (:issue:`5878`, :issue:`5879`)
- Fixed typos and other small errors. (:issue:`5827`, :issue:`5839`,
:issue:`5883`, :issue:`5890`, :issue:`5895`, :issue:`5904`)
Quality assurance
~~~~~~~~~~~~~~~~~
- Extended typing hints. (:issue:`5805`, :issue:`5889`, :issue:`5896`)
- Tests for most of the examples in the docs are now run as a part of CI;
problems found were fixed. (:issue:`5816`, :issue:`5826`, :issue:`5919`)
- Removed usage of deprecated Python classes. (:issue:`5849`)
- Silenced ``include-ignored`` warnings from coverage. (:issue:`5820`)
- Fixed a random failure of the ``test_feedexport.test_batch_path_differ``
test. (:issue:`5855`, :issue:`5898`)
- Updated docstrings to match output produced by parsel_ 1.8.1 so that they
don't cause test failures. (:issue:`5902`, :issue:`5919`)
- Other CI and pre-commit improvements. (:issue:`5802`, :issue:`5823`,
:issue:`5908`)
Files: