./converters/py-chardet, Character encoding auto-detection in Python


Branch: CURRENT, Version: 3.0.4, Package name: py27-chardet-3.0.4, Maintainer: bartosz.kuzma

Character encoding auto-detection in Python.

Required to run:
[devel/py-setuptools] [lang/python27]

Required to build:

Master sites:

SHA1: 4766fb07e700945a7085d073257f1f320d037ce8
RMD160: 03913482c682bf5e2b872d7f0a25d44fc1df9a47
Filesize: 1824.661 KB

CVS history:

   2018-08-21 00:36:21 by Ryosuke Moro | Files touched by this commit (6)
Log message:
   2017-09-03 10:53:18 by Thomas Klausner | Files touched by this commit (165)
Log message:
Follow some redirects.
   2017-06-09 00:19:14 by Thomas Klausner | Files touched by this commit (1)
Log message:
Remove outdated comment, tests work.
   2017-06-08 21:06:52 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
chardet 3.0.4
This minor bugfix release just fixes some packaging and documentation issues:
* Fix issue with setup.py where pytest_runner was always being installed.
* Make sure test.py is included in the manifest
* Fix a bunch of old URLs in the README and other docs.
* Update documentation to no longer imply we test/support Python 3 versions before 3.3
   2017-05-17 09:09:53 by Adam Ciarcinski | Files touched by this commit (2)
Log message:
Changes 3.0.3:
This release fixes a crash when debugging logging was enabled.
   2017-04-23 18:08:02 by Thomas Klausner | Files touched by this commit (1)
Log message:
Add missing unused test dependency.

See also https://github.com/chardet/chardet/issues/120
   2017-04-19 19:24:16 by Thomas Klausner | Files touched by this commit (3) | Package updated
Log message:
Updated py-chardet to 3.0.2.

chardet 3.0.2

Fixes an issue where detect would sometimes return None instead of a dict with the keys encoding, language, and confidence (Issue #113, PR #114).
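
A minimal sketch of what this fix means for calling code: a consumer of a detection result can guard against both a missing result (the pre-3.0.2 behaviour described above) and a result whose encoding is None (assumed here to be possible when no guess is made). The helper name decode_with_detection and the UTF-8 fallback are illustrative assumptions, not part of chardet; the result argument stands in for whatever chardet.detect returned.

```python
def decode_with_detection(data, result, fallback="utf-8"):
    """Decode bytes using a chardet-style detection result.

    `result` is expected to be a dict with the keys encoding, language,
    and confidence, but None is tolerated as well (hypothetical helper).
    """
    # Treat a missing result or a missing encoding as "no guess".
    encoding = (result or {}).get("encoding") or fallback
    # errors="replace" keeps the call from raising on a wrong guess.
    return data.decode(encoding, errors="replace")


# Hand-built results in the shape described by the changelog entry.
good = {"encoding": "ascii", "language": "", "confidence": 1.0}
no_guess = {"encoding": None, "language": None, "confidence": 0.0}

as_ascii = decode_with_detection(b"abc", good)
as_fallback = decode_with_detection(b"abc", no_guess)
as_missing = decode_with_detection(b"abc", None)
```

In real use the second argument would be chardet.detect(data); the hand-built dicts only keep the sketch runnable without the package installed.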

chardet 3.0.1

This bugfix release fixes a crash in the EUC-TW prober when it encountered certain strings (Issue #67).

chardet 3.0.0

This release is long overdue, but still mostly serves as a placeholder
for the impending 4.0.0 release, which will have retrained models
for better accuracy. For now, this release will get the following
improvements up on PyPI:

    Added support for Turkish ISO-8859-9 detection (PR #41, thanks @queeup)
    Commented out large unused sections of Big5 and EUC-KR tables to save memory \ 
    Removed Python 3.2 from testing, but added 3.4 - 3.6
    Ensure that stdin is open with mode 'rb' for chardetect CLI. (PR #38, thanks \ 
    Fixed chardetect crash with non-ascii file names (PR #39, thanks @nkanaev)
    Made naming conventions more Pythonic throughout (no more mTypicalPositiveRatio, and instead typical_positive_ratio)
    Modernized test scripts and infrastructure so we've got Travis testing and all that stuff
    Rename filter_without_english_words to filter_international_words and make it match current Mozilla implementation (PR #44, thanks @rsnair2)
    Updated filter_english_letters to match C implementation (c665459)
    Temporarily disabled Hungarian ISO-8859-2 and Windows-1250 detection because it is very inaccurate (da6c0a0)
    Allow CLI sub-package to be importable (PR #55)
    Add a hypothesis-based test (PR #66, thanks @DRMacIver)
    Strip endianness from UTF with BOM predictions so that the encoding can be passed directly to bytes.decode() (PR #73, thanks @snoack)
    Fixed broken links in docs (PR #90, thanks @roskakori)
    Added early exit to chardetect when encoding is detected instead of looping through entire file (PR #103, thanks @jpz)
    Use bytearray objects internally instead of wrap_ord calls, which provides a nice performance boost across the board (PR #106)
    Add language property to probers and UniversalDetector results (PR #180)
    Mark the 5 known test failures as such so we can have more useful Travis build results in the meantime (d588407)
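
The entry above about stripping endianness from UTF-with-BOM predictions can be illustrated with the standard library alone. This is only a sketch of the underlying codec behaviour, not of chardet's internals: Python's generic "utf-16" codec reads and consumes the byte order mark, while the endian-specific "utf-16-le" codec leaves it in the text as U+FEFF. The sample string is an arbitrary assumption.

```python
import codecs

text = "chardet"
# UTF-16 little-endian payload preceded by its byte order mark.
data = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# The generic codec uses the BOM to pick the byte order and drops it.
generic = data.decode("utf-16")

# The endian-specific codec treats the BOM as an ordinary character,
# so the decoded string starts with U+FEFF.
specific = data.decode("utf-16-le")
```

Reporting the encoding without the endianness suffix therefore lets the result go straight into bytes.decode() without a stray U+FEFF at the start of the text.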
   2017-01-03 14:23:05 by Jonathan Perkin | Files touched by this commit (52)
Log message:
Use "${MV} || ${TRUE}" and "${RM} -f" consistently in post-install targets.