./textproc/py-ftfy, Fixes some problems with Unicode text after the fact



Branch: CURRENT, Version: 6.3.1, Package name: py312-ftfy-6.3.1, Maintainer: pkgsrc-users

Given Unicode text, make its representation consistent and possibly less broken.


Required to run:
[devel/py-setuptools] [textproc/py-html5lib] [devel/py-wcwidth] [lang/python310]

Master sites:

Filesize: 301.687 KB



CVS history:


   2025-01-15 13:46:13 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-ftfy: updated to 6.3.1

Version 6.3.1 (October 25, 2024)

- Fixed `license` metadata field in pyproject.toml.
- Removed extraneous files from the `hatchling` sdist output.

Version 6.3.0 (October 8, 2024)

- Switched packaging from poetry to uv.
- Uses modern Python packaging exclusively (no setup.py).
- Added support for mojibake in Windows-1257 (Baltic).
- Detects mojibake for "Ü" in an uppercase word, such as "ZURÜCK".
- Expanded a heuristic that notices improbable punctuation.
- Fixed a false positive involving two concatenated strings, one of which began
  with the § sign.
- Rewrote `chardata.py` to be more human-readable and debuggable, instead of
  being full of keysmash-like character sets.
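
For readers unfamiliar with the term, the mojibake mentioned in the 6.3.0 notes
can be reproduced with the Python standard library alone. The sketch below is
an illustration, not ftfy's implementation: the helper names `make_mojibake`
and `undo_mojibake` are hypothetical, and Windows-1252 is assumed as the
wrong codec purely for the example.

```python
def make_mojibake(text: str, wrong_encoding: str = "windows-1252") -> str:
    """Encode as UTF-8, then decode the bytes with the wrong single-byte codec."""
    return text.encode("utf-8").decode(wrong_encoding)


def undo_mojibake(garbled: str, wrong_encoding: str = "windows-1252") -> str:
    """Reverse the mix-up: recover the original bytes, then decode them as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    garbled = make_mojibake("ZURÜCK")
    print(garbled)                 # the "Ü" shows up as two unrelated characters
    print(undo_mojibake(garbled))  # round-trips back to the original word
```

The reversal only works here because the wrong codec is known in advance.
ftfy's heuristics exist to decide whether text is garbled at all and which
encoding mix-up most plausibly produced it.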
   2024-04-26 22:10:48 by Adam Ciarcinski | Files touched by this commit (3)
Log message:
py-ftfy: do not install additional files into site-packages directory
   2024-04-26 18:52:00 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-ftfy: updated to 6.2.0

Version 6.2.0 (March 16, 2024)

- Fixed a case where an en-dash and a space near other mojibake would be
  interpreted (probably incorrectly) as MacRoman mojibake.
- Added [project.urls] metadata to pyproject.toml.
- README contains license clarifications for entitled jerks.
   2024-01-06 20:51:02 by Adam Ciarcinski | Files touched by this commit (4) | Package updated
Log message:
py-ftfy: updated to 6.1.3

Version 6.1.3 (November 21, 2023)

- Updated wcwidth.
- Switched to the Apache 2.0 license.
- Dropped support for Python 3.7.

Version 6.1.2 (February 17, 2022)

- Added type information for `guess_bytes`.

Version 6.1.1 (February 9, 2022)

- Updated the heuristic to fix the letter ß in UTF-8/MacRoman mojibake,
 which had regressed since version 5.6.

- Packaging fixes to pyproject.toml.

Version 6.1 (February 9, 2022)

- Updated the heuristic to fix the letter Ñ with more confidence.

- Fixed type annotations and added py.typed.

- ftfy is packaged using Poetry now, and wheels are created and uploaded to
 PyPI.

Version 6.0.3 (May 14, 2021)

- Allow the keyword argument `fix_entities` as a deprecated alias for
 `unescape_html`, raising a warning.

- `ftfy.formatting` functions now disregard ANSI terminal escapes when
 calculating text width.

Version 6.0.2 (May 4, 2021)

This version is purely a cosmetic change, updating the maintainer's e-mail
address and the project's canonical location on GitHub.

Version 6.0.1 (April 12, 2021)

- The `remove_terminal_escapes` step was accidentally not being used. This
 version restores it.

- Specified in setup.py that ftfy 6 requires Python 3.6 or later.

- Use a lighter link color when the docs are viewed in dark mode.

Version 6.0 (April 2, 2021)

- New function: `ftfy.fix_and_explain()` can describe all the transformations
 that happen when fixing a string. This is similar to what
 `ftfy.fixes.fix_encoding_and_explain()` did in previous versions, but it
 can fix more than the encoding.

- `fix_and_explain()` and `fix_encoding_and_explain()` are now in the top-level
 ftfy module.

- Changed the heuristic entirely. ftfy no longer needs to categorize every
 Unicode character, but only characters that are expected to appear in
 mojibake.

- Because of the new heuristic, ftfy will no longer have to release a new
 version for every new version of Unicode. It should also run faster and
 use less RAM when imported.

- The heuristic `ftfy.badness.is_bad(text)` can be used to determine whether
 there appears to be mojibake in a string. Some users were already using
 the old function `sequence_weirdness()` for that, but this one is actually
 designed for that purpose.

- Instead of a pile of named keyword arguments, ftfy functions now take in
 a TextFixerConfig object. The keyword arguments still work, and become
 settings that override the defaults in TextFixerConfig.

- Added support for UTF-8 mixups with Windows-1253 and Windows-1254.

- Overhauled the documentation: https://ftfy.readthedocs.org
   2022-01-05 16:41:32 by Thomas Klausner | Files touched by this commit (289)
Log message:
python: egg.mk: add USE_PKG_RESOURCES flag

This flag should be set for packages that import pkg_resources
and thus need setuptools after the build step.

Set this flag for packages that need it and bump PKGREVISION.
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles