./textproc/py-html2text, Convert HTML into easy-to-read plain ASCII text



Branch: CURRENT, Version: 2019.9.26, Package name: py37-html2text-2019.9.26, Maintainer: schmonz

html2text is a Python script that converts a page of HTML into clean,
easy-to-read plain ASCII text. Better yet, that ASCII also happens
to be valid Markdown (a text-to-HTML format).
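
To illustrate the kind of conversion described above, here is a minimal
sketch that uses only Python's standard-library html.parser. It is not
html2text's implementation and covers only a handful of tags; the names
TinyMarkdown and to_markdown are hypothetical and exist only for this
example.

```python
import re
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Hypothetical, greatly simplified HTML-to-Markdown converter."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []   # output fragments, joined at the end
        self.href = None  # target of the <a> element currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("em", "i"):
            self.parts.append("_")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag == "h1":
            self.parts.append("# ")
        elif tag == "li":
            self.parts.append("* ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("em", "i"):
            self.parts.append("_")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag == "a":
            self.parts.append("](%s)" % (self.href or ""))
            self.href = None
        elif tag in ("h1", "p"):
            self.parts.append("\n\n")  # blank line after block elements
        elif tag == "li":
            self.parts.append("\n")

    def handle_data(self, data):
        # Collapse runs of whitespace to a single space, as a browser would.
        self.parts.append(re.sub(r"\s+", " ", data))


def to_markdown(html):
    """Convert a small HTML fragment to Markdown-flavoured plain text."""
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(to_markdown(
        '<h1>Title</h1><p>Hello <em>world</em>, '
        'see <a href="https://example.org/">this</a>.</p>'
    ))
```

With the package itself installed, the real conversion should be
available from Python as html2text.html2text(html), which handles far
more of HTML than this sketch does.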


Required to run:
[devel/py-setuptools] [lang/python37]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: 40925f552ef9b67cb4d67086d532fb7ecf4f4c4b
RMD160: 4acefb71dd07c49b517f0747b8d6db8ea0cbab89
Filesize: 47.494 KB



CVS history:


   2019-10-09 23:26:12 by Olaf Seibert | Files touched by this commit (1)
Log message:
Mark package incompatible with Python 2.7 (anything < 3.4).
It is not noted clearly in the release notes; see commit
https://github.com/Alir3z4/html2text/co … 49f5e0c6e3
   2019-09-29 14:18:42 by Amitai Schleier | Files touched by this commit (3) | Package updated
Log message:
Update to 2019.9.26. From the changelog:

* Fix long blockquotes wrapping.
* Remove the trailing whitespaces that were added after wrapping list items
  & blockquotes.
* Fix memory leak when processing a document containing a ``<abbr>`` tag.
* Fix ``AttributeError`` when reading text from stdin.
* Fix ``UnicodeEncodeError`` when writing output to stdout.

Updating during the freeze for the bugfixes.
   2019-08-17 17:12:14 by Amitai Schleier | Files touched by this commit (3) | Package updated
Log message:
Update to 2019.8.11. From the changelog:

* Add support for wrapping list items.
* Fix #201: handle &lrm;/&rlm; marks mid-text within stressed tags or
  right after stressed tags.
* Feature #213: ``images_as_html`` config option to always generate an
  ``img`` html tag. preserves "height", "width" and "alt" if possible.
* Remove support for end-of-life Pythons. Now requires Python 2.7 or 3.4+.
* Remove support for retrieving HTML over the network.
* Add ``__main__.py`` module to allow running the CLI using
  ``python -m html2text ...``.
* Fix #238: correct spacing when an HTML entity follows a non-stressed
  tag which follows a stressed tag.
* Remove unused or deprecated:
  * ``html2text.compat.escape()``
  * ``html2text.config.RE_UNESCAPE``
  * ``html2text.HTML2Text.replaceEntities()``
  * ``html2text.HTML2Text.unescape()``
  * ``html2text.unescape()``
* Fix #208: handle LEFT-TO-RIGHT MARK after a stressed tag.
   2018-04-14 11:02:57 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-html2text: updated to 2018.9.1

2018.9.1
Fix: Non-ASCII in title attribute causes encode error.
Feature: Add support for the <kbd> tag.
Feature: Add support for the <q> tag.
   2017-10-25 06:09:46 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-html2text: updated to 2017.10.4

2017.10.4
* Fix 157: Fix images link with div wrap
* Fix 55: Fix error when empty title tags
* Fix 160: The html2text tests are failing on Windows and on Cygwin due to
  differences in eol handling between windows/*nix
* Feature 164: Housekeeping: Add flake8 to the travis build, cleanup existing
  flake8 violations, add py3.6 and pypy3 to the travis build
* Fix 109: Fix for unexpanded &lt; &gt; &amp;
* Fix 143: Fix line wrapping for the lines starting with bold
* Adds support for numeric bold text indication in `font-weight`,
  as used by Google (and presumably others.)
* Fix 173 and 142: Stripping whitespace in crucial markdown and adding
  whitespace as necessary
* Don't drop any cell data on tables uneven row lengths (e.g. colspan in use)
   2017-01-03 14:23:05 by Jonathan Perkin | Files touched by this commit (52)
Log message:
Use "${MV} || ${TRUE}" and "${RM} -f" consistently in post-install targets.
   2016-10-05 11:18:36 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
Updated py-html2text to 2016.9.19.

2016.9.19
=========

* Default image alt text option created and set to a default of empty string
  "" to maintain backward compatibility
* Fix #136: --default-image-alt now takes a string as argument
* Fix #113: Stop changing quiet levels on \/script tags.
* Merge #126: Fix deprecation warning on py3 due to html.escape
* Fix #145: Running test suite on Travis CI for Python 2.6.
   2016-08-28 17:48:37 by Thomas Klausner | Files touched by this commit (112)
Log message:
Remove unnecessary PLIST_SUBST and FILES_SUBST that are now provided
by the infrastructure.

Mark a couple more packages as not ready for python-3.x.