./textproc/py-html-sanitizer, White-list based HTML sanitizer



Branch: CURRENT, Version: 1.6.1, Package name: py27-html-sanitizer-1.6.1, Maintainer: joerg

html-sanitizer is a whitelist-based and very opinionated HTML sanitizer
that can be used both for untrusted and trusted sources. It attempts to
clean up the mess made by various rich text editors and/or copy-pasting
to make styling of webpages simpler and more consistent. It builds on the
excellent HTML cleaner in lxml to make the result both valid and safe.

It goes further than pure tag filtering by transforming the HTML
fragments to normalize formatting and drop redundant or pointless tags.
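
The whitelist idea described above can be sketched with the Python standard
library alone. This is only an illustration of tag and attribute filtering,
not html-sanitizer's implementation or API: the names ALLOWED_TAGS,
ALLOWED_ATTRS, SAFE_URL_PREFIXES, WhitelistFilter and filter_html are
hypothetical, the whitelist is not the package's default, and none of the
normalization or lxml-based cleaning the package performs is attempted here.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration; not html-sanitizer's defaults.
ALLOWED_TAGS = {"p", "br", "strong", "em", "a", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href", "title"}}
VOID_TAGS = {"br"}                      # emitted without a closing tag
DROP_CONTENT = {"script", "style"}      # drop these tags together with their text
SAFE_URL_PREFIXES = ("http://", "https://", "mailto:", "/", "#")


class WhitelistFilter(HTMLParser):
    """Keep whitelisted tags/attributes; drop other markup but keep its text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = []
        for name, value in attrs:
            value = value or ""
            if name not in allowed:
                continue
            # Reject href values that do not start with a known-safe prefix.
            if name == "href" and not value.lower().startswith(SAFE_URL_PREFIXES):
                continue
            kept.append(' {}="{}"'.format(name, escape(value, quote=True)))
        self.parts.append("<{}{}>".format(tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(escape(data, quote=False))


def filter_html(fragment):
    """Return the fragment with everything outside the whitelist removed."""
    parser = WhitelistFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


dirty = ('<p onclick="x()">Hi <font color="red"><strong>there</strong></font>'
         '<script>alert(1)</script></p>')
print(filter_html(dirty))
```

A filter like this only removes what is not listed. The transformations the
description mentions (merging or dropping redundant tags, normalizing
whitespace) are a separate step that this sketch leaves out.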


Required to run:
[devel/py-setuptools] [textproc/py-lxml] [lang/python27] [www/py-beautifulsoup4]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: 6d9e7ad1aaea4133413a65f817496194c93688f7
RMD160: f6a259bce6df2720e626deea5ce13a8f4c58b217
Filesize: 13.424 KB

CVS history:


   2018-08-07 10:29:40 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-html-sanitizer: updated to 1.6.1

1.6:
Fixed another edge case where a tag which is allowed to be empty was erroneously
removed if it contained not only whitespace but also a <br> tag.

1.5:
Fixed a few whitespace normalization edge cases and a bug where removing an
empty tag removed all whitespace.
Added black for automatically formatting the Python code.
By default, links with target="_blank" get an additional rel="noopener"
attribute (Article by Mathias Bynens). If you're overriding the list of
allowed attributes for anchor tags you must add rel to your list.
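
The rel="noopener" note in the 1.5 entry can be illustrated with a small
standalone sketch. It assumes, purely for illustration, that the attribute
whitelist is applied after rel is added; the log does not state the library's
actual ordering. The helper names add_noopener and filter_attrs are
hypothetical and are not part of html-sanitizer.

```python
def add_noopener(attrs):
    """Return a copy of an anchor's attributes with rel="noopener" added
    when the link opens in a new window via target="_blank"."""
    result = dict(attrs)
    if result.get("target") == "_blank":
        rel_tokens = result.get("rel", "").split()
        if "noopener" not in rel_tokens:
            rel_tokens.append("noopener")
        result["rel"] = " ".join(rel_tokens)
    return result


def filter_attrs(attrs, allowed):
    """Keep only the attributes whose names are in the allowed set."""
    return {name: value for name, value in attrs.items() if name in allowed}


link = {"href": "https://example.com/", "target": "_blank"}
with_rel = add_noopener(link)

# A custom whitelist that omits "rel" silently discards the added attribute.
print(filter_attrs(with_rel, {"href", "target"}))
# Listing "rel" keeps it.
print(filter_attrs(with_rel, {"href", "target", "rel"}))
```

Under that assumption, a custom attribute list without rel would discard the
added attribute again, which is one plausible reading of why the entry asks
for rel to be listed.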
   2018-04-05 15:11:32 by Joerg Sonnenberger | Files touched by this commit (2) | Package updated
Log message:
Update to py-html-sanitizer-1.4.0:

- don't drop <form>-related elements in the lxml cleaner
- allow better control of element merging
- fix some more cases for nested paragraphs
   2018-02-13 22:00:03 by Joerg Sonnenberger | Files touched by this commit (4)
Log message:
Add py-html-sanitizer-1.3.0:

html-sanitizer is a whitelist-based and very opinionated HTML sanitizer
that can be used both for untrusted and trusted sources. It attempts to
clean up the mess made by various rich text editors and/or copy-pasting
to make styling of webpages simpler and more consistent. It builds on the
excellent HTML cleaner in lxml to make the result both valid and safe.

It goes further than pure tag filtering by transforming the HTML
fragments to normalize formatting and drop redundant or pointless tags.