./converters/py-charset-normalizer, Universal Charset Detector


Branch: CURRENT, Version: 2.1.0, Package name: py310-charset-normalizer-2.1.0, Maintainer: pkgsrc-users

A library that helps you read text from an unknown charset encoding.


Master sites:

Filesize: 79.853 KB

CVS history:


   2022-08-05 15:59:38 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.1.0

2.1.0 (2022-06-19)

Added

Output the Unicode table version when running the CLI with --version

Changed

Re-use decoded buffer for single byte character sets
Fixing some performance bottlenecks

Fixed

Work around a potential bug in CPython where Zero Width No-Break Space (located in Arabic Presentation Forms-B, Unicode 1.1) is not acknowledged as a space
CLI default threshold aligned with the API threshold
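
The Zero Width No-Break Space item above is easier to follow with a concrete check. The sketch below uses only the standard library to show how CPython classifies U+FEFF. It does not reproduce charset-normalizer's own workaround, which this log does not describe.

```python
import unicodedata

# U+FEFF is the last code point of the Arabic Presentation Forms-B
# block (U+FE70..U+FEFF).
ZWNBSP = "\ufeff"

name = unicodedata.name(ZWNBSP)
category = unicodedata.category(ZWNBSP)
in_block = 0xFE70 <= ord(ZWNBSP) <= 0xFEFF
is_space = ZWNBSP.isspace()

print(name)      # ZERO WIDTH NO-BREAK SPACE
print(category)  # Cf (format character, not a space separator)
print(in_block)  # True
print(is_space)  # False: str.isspace() does not treat it as whitespace
```

Code that relies on `str.isspace()` or `str.strip()` to discard invisible characters therefore has to handle U+FEFF explicitly.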

Removed

Support for Python 3.5

Deprecated

Use of the unicodedata backport from unicodedata2, as Python is quickly catching up; scheduled for removal in 3.0
   2022-02-12 18:53:15 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.0.12

2.0.12
Fixed
- ASCII misdetection in rare cases
   2022-01-31 12:04:38 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.0.11

2.0.11:

Added
- Explicit support for Python 3.11

Changed
- The logging behavior has been completely reviewed, now using only TRACE and DEBUG levels
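
The standard `logging` module has no built-in TRACE level, so code that logs at TRACE has to register one. A minimal sketch of how such a level can be defined and filtered, assuming a numeric value of 5 (below DEBUG's 10). Both that value and the logger name `demo_trace` are assumptions for illustration, not taken from this log.

```python
import io
import logging

TRACE = 5  # assumed value, lower than logging.DEBUG (10)
logging.addLevelName(TRACE, "TRACE")

logger = logging.getLogger("demo_trace")  # hypothetical logger name
logger.propagate = False                  # keep the demo isolated from the root logger

buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

logger.setLevel(logging.DEBUG)
logger.log(TRACE, "per-chunk detail")  # dropped: TRACE is below DEBUG
logger.debug("summary")                # emitted

logger.setLevel(TRACE)
logger.log(TRACE, "per-chunk detail")  # emitted now

output = buffer.getvalue()
print(output)
```

With only two levels in play, an application can opt into the summary messages at DEBUG and reserve TRACE for the most verbose detail.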
   2022-01-07 17:37:10 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.0.10

2.0.10:
Fixed
- Fallback match entries might lead to UnicodeDecodeError for large byte sequences
   2022-01-05 16:41:32 by Thomas Klausner | Files touched by this commit (289)
Log message:
python: egg.mk: add USE_PKG_RESOURCES flag

This flag should be set for packages that import pkg_resources
and thus need setuptools after the build step.

Set this flag for packages that need it and bump PKGREVISION.
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-12-11 21:47:41 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.0.9

2.0.9

Changed
- Moderating the logging impact (since 2.0.8) for specific environments

Fixed
- Wrong logging level applied when setting kwarg `explain` to True
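
The `explain` kwarg mentioned in the fix above is described in the 2.0.8 notes further down as temporarily adding a stream handler, on top of a library logger that starts with a `NullHandler`. A minimal sketch of that pattern using only the standard library follows. The function `detect`, its `stream` parameter, and the logger name are hypothetical stand-ins, not charset-normalizer's API.

```python
import io
import logging

logger = logging.getLogger("demo_explain")  # hypothetical library logger
logger.addHandler(logging.NullHandler())    # silent unless the application opts in
logger.propagate = False                    # keep the demo isolated from the root logger


def detect(payload: bytes, explain: bool = False, stream=None) -> str:
    """Hypothetical stand-in for a detection call; it always decodes as UTF-8."""
    handler = None
    previous_level = logger.level
    if explain:
        # Attach a handler only for the duration of this call.
        handler = logging.StreamHandler(stream)
        handler.setFormatter(logging.Formatter("%(levelname)s | %(message)s"))
        logger.addHandler(handler)
        logger.setLevel(logging.DEBUG)
    try:
        logger.debug("trying utf_8 on %d bytes", len(payload))
        return payload.decode("utf-8")
    finally:
        if handler is not None:
            # Restore the logger so later calls are quiet again.
            logger.removeHandler(handler)
            logger.setLevel(previous_level)


sink = io.StringIO()
quiet = detect(b"caf\xc3\xa9")                               # logs nothing
verbose = detect(b"caf\xc3\xa9", explain=True, stream=sink)  # logs to sink
print(sink.getvalue())
```

Removing the handler in a `finally` block keeps the extra output bounded to the call even when decoding raises.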
   2021-11-25 09:10:29 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-charset-normalizer: updated to 2.0.8

2.0.8
Changed
- Improved Vietnamese detection
- MD improvement on trailing data and long foreign (non-pure Latin) data
- Efficiency improvements in cd/alphabet_languages from [@adbar](https://github.com/adbar)
- call sum() without an intermediary list following PEP 289 recommendations from [@adbar](https://github.com/adbar)
- Code style as refactored by Sourcery-AI
- Minor adjustment on the MD around European words
- Remove and replace SRTs from assets / tests
- Initialize the library logger with a `NullHandler` by default from [@nmaynes](https://github.com/nmaynes)
- Setting kwarg `explain` to True will provisionally add a specific stream handler (bounded to the function's lifespan)
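
The `sum()` item in the list above refers to passing a generator expression (PEP 289) directly instead of building a list first. A small illustration with made-up data; the word list is not taken from the library:

```python
words = ["déjà", "vu", "naïve", "plain"]

# List comprehension: builds a temporary list before summing.
with_list = sum([1 for word in words if not word.isascii()])

# Generator expression (PEP 289): feeds sum() one item at a time,
# so no intermediary list is allocated.
with_generator = sum(1 for word in words if not word.isascii())

print(with_list, with_generator)  # 2 2
```

Both forms return the same result. The generator form just avoids the temporary list, which matters most on long inputs.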