textproc/py-html5lib
HTML5 parser and tokenizer
Branch: pkgsrc-2021Q3
Version: 1.0.1
Package name: py38-html5lib-1.0.1
Maintainer: joerg

html5lib is a pure-Python library for parsing HTML. The parser is
designed to handle all flavours of HTML and parses invalid documents
using well-defined error handling rules compatible with the behaviour of
major desktop web browsers.
Output is to a tree structure; the current release supports output to
DOM, ElementTree, lxml and BeautifulSoup tree formats as well as a
simple custom format.
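
As a rough illustration of the description above, the following sketch parses a deliberately malformed fragment into an ElementTree structure. It assumes the package is installed and importable as html5lib; the helper name parse_to_etree and the sample markup are made up for this example and are not part of the package.

```python
import html5lib


def parse_to_etree(markup):
    """Parse markup and return the root <html> element as an ElementTree node.

    parse_to_etree is a hypothetical helper for this example only.
    namespaceHTMLElements=False keeps tag names plain ("p" rather than
    "{http://www.w3.org/1999/xhtml}p"), which makes path lookups shorter.
    """
    return html5lib.parse(markup, treebuilder="etree",
                          namespaceHTMLElements=False)


# Invalid input: no <html>, <head> or <body>, and neither tag is closed.
root = parse_to_etree("<p>Hello <b>world")

# The parser supplies the missing structure, so the usual paths resolve.
bold = root.find("body/p/b")
print(root.tag, bold.text)
```

Passing treebuilder="dom" or treebuilder="lxml" instead selects one of the other tree formats, provided the corresponding library is available.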
Master sites:
SHA1: 5e1a2c7e18de7d1d0883e223f1733dc6dc796ee2
RMD160: aba1f653b8ac0f8748de4408343eeb80c3589f90
Filesize: 247.03 KB
Version history:
- (2021-09-28) Package added to pkgsrc.se, version py38-html5lib-1.0.1 (created)