./textproc/py-Unidecode, ASCII transliterations of Unicode text



Branch: CURRENT, Version: 1.2.0, Package name: py38-Unidecode-1.2.0, Maintainer: pkgsrc-users

It often happens that you have text data in Unicode but need to
represent it in ASCII: for example, when integrating with legacy
code that doesn't support Unicode, for ease of entry of non-Roman
names on a US keyboard, or when constructing ASCII machine identifiers
from human-readable Unicode strings that should still be somewhat
intelligible (a popular example is making a URL slug from an
article title).

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that also contain, for example, ASCII approximations for symbols and
non-Latin alphabets.
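
For comparison, the plain accent-stripping approach mentioned above can
be sketched with the standard library alone. This is a minimal
illustration, not part of Unidecode; the helper name strip_accents is
hypothetical. It shows why a table-based transliteration is needed:
characters without a Unicode decomposition to ASCII are silently lost.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Baseline: decompose characters (NFKD), then drop all non-ASCII code points.

    Hypothetical helper for illustration only; it is not Unidecode's API.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Accented Latin letters decompose into base letter + combining mark,
# so the base letter survives.
print(strip_accents("Café Zürich"))  # Cafe Zurich

# "Ł" has no decomposition, so it is dropped entirely.
print(strip_accents("Łódź"))  # odz

# Cyrillic letters have no ASCII decomposition either; nothing is left.
print(repr(strip_accents("Москва")))  # ''
```

With Unidecode installed, the corresponding call would be
unidecode.unidecode(text), which consults its mapping tables instead of
relying on Unicode decomposition.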

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.


Required to run:
[devel/py-setuptools] [lang/python37]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: bec2cc868c2429c895368d1a6646855c1eafee1e
RMD160: ea53ed4b2322cc7f0f47461c268170bcae518893
Filesize: 210.979 KB



CVS history:


   2021-02-05 20:28:06 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-Unidecode: updated to 1.2.0

unidecode 1.2.0
* Add 'errors' argument that specifies how characters with unknown
  replacements are handled. Default is 'ignore' to replicate the
  behavior of older versions.
* Many characters that were previously replaced with '[?]' are now
  correctly marked as unknown and will behave as specified in the
  new errors='...' argument.
* Added some missing ligatures and quotation marks in U+1F6xx and
  U+27xx ranges.
* Add PEP 561-style type information (thanks to Pascal Corpet)
* Support for Python 2 and 3.5 to be removed in next release.
   2020-12-21 10:25:32 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-Unidecode: updated to 1.1.2

unidecode 1.1.2
* Add some missing replacements in the U+23xx page.
* Fix U+204A "TIRONIAN SIGN ET" replacement.
   2019-07-02 12:21:09 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-Unidecode: updated to 1.1.1

unidecode 1.1.1
* Fix tests failing on PyPy 7.1.1
   2019-06-18 16:40:48 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-Unidecode: updated to 1.1.0

unidecode 1.1.0
* Add more Latin letter variants in U+1F1xx page.
* Make it possible to use the Unidecode command-line utility via
  "python -m unidecode"
* General clean up of code and documentation
   2018-11-21 12:00:05 by Adam Ciarcinski | Files touched by this commit (4) | Package updated
Log message:
py-Unidecode: updated to 1.0.23

unidecode 1.0.23
* Improve transliteration of Hebrew letters
* Add transliterations for the phonetic block U+1D00 - U+1D7F
* Transliterate SI "micro" prefix as "u" instead of "micro" in the
  U+33xx block.
* Add U+33DE SQUARE V OVER M and U+33DF SQUARE A OVER M.
* Drop support for Python 2.6 and 3.3
   2018-02-26 11:55:46 by Adam Ciarcinski | Files touched by this commit (1) | Package updated
Log message:
py-Unidecode: updated CATEGORIES and HOMEPAGE
   2018-01-28 17:36:59 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-Unidecode: update to 1.0.22.

2018-01-05	unidecode 1.0.22
	* Move to semantic version numbering, no longer following version
	  numbers from the original Perl module. This fixes an issue with
	  setuptools (>= 8) and others expecting major.minor.patch format.
	  (https://github.com/avian2/unidecode/issues/13)
	* Add transliterations for currency signs U+20B0 through U+20BF
	  (thanks to Mike Swanson)
	* Surround transliterations of vulgar fractions with spaces to avoid
	  incorrect combinations with adjacent numerals
	  (thanks to Jeffrey Gerard)
   2017-09-04 20:08:31 by Thomas Klausner | Files touched by this commit (163)
Log message:
Follow some redirects.