textproc/py-html-sanitizer,
White-list based HTML sanitizer
Branch: CURRENT,
Version: 2.4.4,
Package name: py311-html-sanitizer-2.4.4,
Maintainer: pkgsrc-users

html-sanitizer is a whitelist-based and very opinionated HTML sanitizer
that can be used both for untrusted and trusted sources. It attempts to
clean up the mess made by various rich text editors and/or copy-pasting
to make styling of webpages simpler and more consistent. It builds on the
excellent HTML cleaner in lxml to make the result both valid and safe.
It goes further than pure tag filtering by transforming the HTML
fragments to normalize formatting and drop redundant or pointless tags.
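
The "pure tag filtering" step mentioned above can be sketched with the Python
standard library alone. This is only an illustration of the allow-list idea,
not how html-sanitizer is implemented (the package builds on lxml); the
allow-list contents and the names AllowListFilter and filter_html are
assumptions made for this sketch. It does not validate URLs in href and does
no normalization, so it should not be used on untrusted input as-is.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allow-list for this sketch; not html-sanitizer's defaults.
ALLOWED_TAGS = {"p", "strong", "em", "a", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
VOID_TAGS = {"br"}                   # tags that never get a closing tag
SKIP_CONTENT = {"script", "style"}   # drop these tags together with their text


class AllowListFilter(HTMLParser):
    """Re-emit only allowed tags and attributes; text is kept and escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped, its text content still passes through
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def filter_html(fragment):
    """Return the fragment with everything outside the allow-list removed."""
    parser = AllowListFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

Calling filter_html on a fragment drops disallowed tags and attributes while
keeping their text; html-sanitizer goes further by also merging and
normalizing the resulting markup.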
Required to run:
[textproc/py-lxml] [www/py-beautifulsoup4] [lang/python310]
Master sites:
Filesize: 16.853 KB
Version history:
- (2024-05-27) Updated to version: py311-html-sanitizer-2.4.4
- (2024-03-11) Updated to version: py311-html-sanitizer-2.3.1
- (2024-02-07) Updated to version: py311-html-sanitizer-2.3.0
- (2023-11-27) Updated to version: py311-html-sanitizer-2.2.0
- (2022-11-30) Updated to version: py310-html-sanitizer-1.9.3
- (2022-11-09) Updated to version: py310-html-sanitizer-1.6.4nb1
CVS history:
2024-03-11 07:55:41 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
py-html-sanitizer: updated to 2.3.1
2.3.1
- Fixed an edge case where ``br`` tag attributes weren't removed if the br tag
appears first.
|
2024-02-07 21:12:24 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
py-html-sanitizer: updated to 2.3.0
2.3 (2024-02-07)
- Avoided adding whitespace when merging tags of the same type.
- Updated the tests.
- Switched from black to the ruff formatter.
|
2023-11-27 21:21:00 by Adam Ciarcinski | Files touched by this commit (3)
Log message:
py-html-sanitizer: updated to 2.2.0
2.2 (2023-07-03)
- Changed ``keep_normalized_whitespace`` to preserve whitespace at the tail of
tags, not just between tags.
- Changed the parameters of ``normalize_whitespace_in_text_or_tail`` to be
keyword-only.
2.1 (2023-06-29)
- Added a test for a type of misconfiguration.
- Changed the sanitizer configuration validation to not allow unexpected data
types in ``tags``, ``empty``, ``separate``, ``whitespace`` and
``attributes``.
2.0 (2023-06-28)
- Raised the minimum Python version to 3.7. Added Python 3.10, 3.11.
- Raised the minimum lxml version to the current 4.9.1.
- Switched from Travis CI to GitHub actions. Added Python 3.9 to the CI
matrix.
- Renamed the main branch to main.
- Switched to a declarative setup.
- Fixed a whitespace dependency in the testsuite.
- Switched to hatchling and ruff.
- Made behavior-altering arguments to ``normalize_overall_whitespace``
keyword-only.
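
Several of the entries above make arguments keyword-only. As a reminder of
what that means for callers, here is a minimal sketch; normalize_whitespace
and its keep_newlines flag are hypothetical names chosen for illustration and
are not the library's actual functions or parameters.

```python
import re


def normalize_whitespace(text, *, keep_newlines=False):
    """Collapse whitespace runs (hypothetical helper, not the library's API).

    Everything after the bare ``*`` must be passed by keyword, so a call site
    always spells out which behavior-altering option it sets.
    """
    if keep_newlines:
        # Collapse spaces and tabs only, leaving line breaks in place.
        return re.sub(r"[ \t]+", " ", text)
    # Collapse every whitespace run, including newlines, to one space.
    return re.sub(r"\s+", " ", text)
```

With this signature, normalize_whitespace(text, keep_newlines=True) works,
while normalize_whitespace(text, True) raises TypeError, which is the kind of
breakage callers may see after such a change.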
|
2022-11-30 17:43:32 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
py-html-sanitizer: updated to 1.9.3
1.9 (2020-01-20)
- Added Python 3.8 to the CI matrix.
- Be able to keep the <style> tag by adding it to tags.
- Added a style check to the CI matrix.
1.8 (2019-11-21)
- Actually added support for customizing lxml's autolinking behavior using a
dictionary argument.
- Stopped removing explicitly allowed attributes.
- Removed id from allowed attributes of <a> tags to provide an additional
layer of defense against DOM clobbering attacks.
- Added an element preprocessor which assigns the id value to the name
attribute of anchors if name isn't set or empty. This should provide
additional backwards compatibility making the id removal less of a problem
when using named anchors.
1.7 (2019-02-19)
- Added a system check which validates sanitizer configurations early when
using Django.
- Fixed an edge case where passing in an empty allowed tags list would
unexpectedly and silently not remove any tags at all (because that's the way
lxml's cleaner works).
- Changed the sanitizer tags, empty and separate options to also accept any
iterable, not just sets.
- Changed the lru_cache import in the Django module to try functools first.
- Fixed the tag merging to also check tags in empty. This means that e.g.
consecutive <hr> tags are also merged now when using the default settings.
- Made it possible to override the set of tags processed as whitespace. The
default set is {"br"} which preserves the current behavior of stripping
breaks from the beginning or end of tags' content.
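
The 1.8 entry about anchors describes a small tree transformation: copy id to
name when name is missing or empty, then drop id. A rough sketch of that idea
follows, using the standard library's ElementTree rather than lxml so that it
runs without extra packages. The function name move_anchor_id_to_name is an
assumption for this sketch, not the preprocessor shipped by html-sanitizer.

```python
import xml.etree.ElementTree as ET


def move_anchor_id_to_name(root):
    """Illustrative preprocessor; not the library's actual implementation."""
    for anchor in root.iter("a"):
        # Remove id in every case, remembering its value.
        anchor_id = anchor.attrib.pop("id", None)
        # Only fill in name when it is missing or empty.
        if anchor_id and not anchor.get("name"):
            anchor.set("name", anchor_id)
    return root
```

For example, an element parsed from '<a id="top">Top</a>' ends up with
name="top" and no id, while an anchor that already has a non-empty name keeps
it and only loses its id.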
|
2022-11-09 14:14:32 by Joerg Sonnenberger | Files touched by this commit (223)
Log message:
Reset MAINTAINER
|
2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users
They now have a tool dependency on py-setuptools instead of a DEPENDS
|
2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums
All checksums have been double-checked against existing RMD160 and
SHA512 hashes
Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
|
2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
|