./textproc/py-Unidecode, ASCII transliterations of Unicode text



Branch: CURRENT, Version: 1.3.8, Package name: py311-Unidecode-1.3.8, Maintainer: pkgsrc-users

It often happens that you have text data in Unicode, but you need
to represent it in ASCII, for example when integrating with legacy
code that doesn't support Unicode, for ease of entry of non-Roman
names on a US keyboard, or when constructing ASCII machine identifiers
from human-readable Unicode strings that should still be somewhat
intelligible (a popular example of this is making a URL slug
from an article title).

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that also contain, for example, ASCII approximations for symbols and
non-Latin alphabets.
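
As a rough illustration of the difference, the sketch below strips
accents using only the standard library and builds a URL slug from a
title. The helper names strip_accents and slugify are hypothetical and
not part of this package; the slug helper uses unidecode() when the
module is installed and otherwise falls back to plain accent stripping,
which drops characters that have no accent-free ASCII form.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks using only the standard library."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


try:
    # Provided by this package when it is installed.
    from unidecode import unidecode
except ImportError:
    # Assumption for this sketch: degrade to accent stripping only.
    unidecode = None


def slugify(title: str) -> str:
    """Build a lowercase ASCII slug from a title (hypothetical helper)."""
    ascii_text = unidecode(title) if unidecode else strip_accents(title)
    # Drop anything that is still not ASCII (only relevant for the fallback).
    ascii_text = ascii_text.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(strip_accents("Crème Brûlée"))
print(slugify("Crème Brûlée: A Recipe!"))
```

Accent stripping alone leaves text such as Chinese or Cyrillic
unchanged, because those characters carry no combining marks to
remove; that is the case the module's mapping tables are meant to cover.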

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.


Required to run:
[devel/py-setuptools] [lang/python310]

Master sites:

Filesize: 188.185 KB


CVS history:


   2024-01-11 14:06:36 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-Unidecode: updated to 1.3.8

unidecode 1.3.8
* Fix replacement for U+1E9E "LATIN CAPITAL LETTER SHARP S"
   2023-09-27 11:02:06 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-Unidecode: updated to 1.3.7

unidecode 1.3.7
* Add missing replacements for katakana punctuation (thanks to
  Emil Hammarberg)
* Fix replacement for U+1F19C "SQUARED SECOND SCREEN".
* Fix replacement for U+1F1A9 "SQUARED LOSSLESS".
* Add more replacements for symbols in the U+21xx and
  U+1F1xx pages (thanks to @cheznewa on GitHub)
* Remove old __init__.pyi from the Wheel package that was included due
  to a bug in the build script.
   2022-10-31 22:58:01 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-Unidecode: updated to 1.3.6

unidecode 1.3.6
* No changes. Re-upload to PyPI.

unidecode 1.3.5
* Remove trailing space in replacements for vulgar fractions.
* This release was yanked from PyPI, because the Wheel package
  contained the wrong version of the code and was incompatible
  with Python 3.5. The .tar.gz package was not affected.

unidecode 1.3.4
* Add some missing replacements for symbols in the U+21xx and
  U+1F1xx pages (thanks to @cheznewa on GitHub)

unidecode 1.3.3
* Command-line utility now reads input line-by-line, making
  it usable with large files (thanks to Jan-Thorsten Peter)

unidecode 1.3.2
* Re-upload because PyPI was missing Requires-Python metadata for
  the .tar.gz package.

unidecode 1.3.1
* Fix issue with wheel package falsely claiming support for Python 2.

unidecode 1.3.0
* Drop support for Python <3.5.
* Improvements to Hebrew and Yiddish transliterations (thanks to Alon
  Bar-Lev and @eyaler on GitHub)
   2022-01-05 16:41:32 by Thomas Klausner | Files touched by this commit (289)
Log message:
python: egg.mk: add USE_PKG_RESOURCES flag

This flag should be set for packages that import pkg_resources
and thus need setuptools after the build step.

Set this flag for packages that need it and bump PKGREVISION.
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
   2021-02-05 20:28:06 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-Unidecode: updated to 1.2.0

unidecode 1.2.0
* Add 'errors' argument that specifies how characters with unknown
  replacements are handled. Default is 'ignore' to replicate the
  behavior of older versions.
* Many characters that were previously replaced with '[?]' are now
  correctly marked as unknown and will behave as specified in the
  new errors='...' argument.
* Added some missing ligatures and quotation marks in U+1F6xx and
  U+27xx ranges.
* Add PEP 561-style type information (thanks to Pascal Corpet)
* Support for Python 2 and 3.5 to be removed in next release.
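
The 'errors' argument introduced in 1.2.0 can be tried out with a
short sketch like the one below. It assumes Unidecode 1.2.0 or later
is installed; show_error_modes is a hypothetical helper written for
this illustration, and the private-use code point U+E000 is used as an
example of a character without a replacement. The import is guarded
so that the sketch still loads when the module is absent.

```python
try:
    from unidecode import unidecode, UnidecodeError
except ImportError:
    # Assumption for this sketch: return nothing if the module is missing.
    unidecode = None


def show_error_modes(text: str) -> dict:
    """Collect the result of each errors= mode for text (hypothetical helper)."""
    if unidecode is None:
        return {}
    results = {
        # Default: characters without a replacement are dropped.
        "ignore": unidecode(text, errors="ignore"),
        # Substitute a placeholder string for unknown characters.
        "replace": unidecode(text, errors="replace", replace_str="?"),
        # Keep the original character, so the result may not be pure ASCII.
        "preserve": unidecode(text, errors="preserve"),
    }
    try:
        results["strict"] = unidecode(text, errors="strict")
    except UnidecodeError as exc:
        # exc.index is the position of the offending character.
        results["strict"] = f"UnidecodeError at index {exc.index}"
    return results


PRIVATE_USE = "\ue000"  # private-use code point, no ASCII replacement
modes = show_error_modes("a" + PRIVATE_USE + "b")
for name, value in modes.items():
    print(name, repr(value))
```

With errors='preserve' the output can still contain non-ASCII
characters, so callers that need strict ASCII may prefer 'replace' or
'strict'.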