./textproc/py-Unidecode, ASCII transliterations of Unicode text


Branch: CURRENT, Version: 0.04.21, Package name: py27-Unidecode-0.04.21, Maintainer: pkgsrc-users

It often happens that you have text data in Unicode, but you need
to represent it in ASCII, for example when integrating with legacy
code that doesn't support Unicode, for ease of entry of non-Roman
names on a US keyboard, or when constructing ASCII machine identifiers
from human-readable Unicode strings that should still be somewhat
intelligible (a popular example of this is making a URL slug
from an article title).
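
The URL slug case mentioned above can be sketched roughly as follows. This
is only an illustration: the slugify helper and its transliterate parameter
are hypothetical names, not part of the package. The one call assumed from
the module itself is unidecode.unidecode(), which is imported only if the
package is installed.

```python
import re


def slugify(title, transliterate=None):
    """Build an ASCII URL slug from a title (hypothetical helper).

    `transliterate` is any callable mapping str -> str. Passing
    unidecode.unidecode gives readable approximations; the default
    used here simply drops every non-ASCII character.
    """
    if transliterate is None:
        # Crude placeholder: discard anything that is not ASCII.
        def transliterate(text):
            return text.encode("ascii", "ignore").decode("ascii")

    ascii_text = transliterate(title)
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower())
    return slug.strip("-")


if __name__ == "__main__":
    try:
        from unidecode import unidecode  # provided by this package
    except ImportError:
        unidecode = None  # fall back to the crude placeholder
    print(slugify("Crème brûlée: a recipe", unidecode))
```

Keeping the transliteration step as a parameter means the slug rules
(lowercasing, hyphen collapsing) stay independent of how non-ASCII
characters are approximated.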

Note that this module generally produces better results than simply
stripping accents from characters (which can be done in Python with
built-in functions). It is based on hand-tuned character mappings
that also contain, for example, ASCII approximations for symbols and
non-Latin alphabets.
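
For comparison, the accent-stripping baseline mentioned above might look
like the sketch below, using only the standard unicodedata module. The
function names are hypothetical. It shows why the baseline falls short:
characters with no decomposition, or from non-Latin scripts, are simply
lost instead of being approximated.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii_lossy(text):
    """Strip accents, then discard whatever is still not ASCII."""
    return strip_accents(text).encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    for sample in ("café", "Søren", "Москва"):
        print(repr(sample), "->", repr(to_ascii_lossy(sample)))
```

Accented Latin letters survive this treatment, but a letter such as "ø"
has no decomposition and Cyrillic text has no ASCII base letters, so both
disappear. Hand-tuned mapping tables are meant to cover those cases.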

This is a Python port of the Text::Unidecode Perl module by Sean M.
Burke.


Required to run:
[devel/py-setuptools] [lang/python27]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: a6c0f413bfc5d9de7bf7807b7b7589cea55f7b6b
RMD160: 2c30cc6d15f2761ce7874710c9920968f270605f
Filesize: 201.104 KB


CVS history:


   2017-09-04 20:08:31 by Thomas Klausner | Files touched by this commit (163)
Log message:
Follow some redirects.
   2017-07-23 21:02:53 by Adam Ciarcinski | Files touched by this commit (1)
Log message:
Removed - from ALTERNATIVES
   2017-07-23 20:52:03 by Adam Ciarcinski | Files touched by this commit (4)
Log message:
unidecode 0.04.21:
* Add U+2116 NUMERO SIGN
* Add U+05BE HEBREW PUNCTUATION MAQAF

unidecode 0.04.20:
* Fixed transliteration of circled Latin letters and numbers
* Add square unit symbols.
* Add Latin variants in U+20xx and U+21xx pages.
* Fix U+02B1 MODIFIER LETTER SMALL H WITH HOOK.
* Fix U+205F MEDIUM MATHEMATICAL SPACE.
* Add "DIGIT ... COMMA" and "PARENTHESIZED LATIN CAPITAL LETTER"
  in U+1F1xx page.
* Add missing vulgar fractions and a/c, a/s, c/o, c/u symbols.
* Add universal Wheel release
   2016-06-08 19:43:49 by Thomas Klausner | Files touched by this commit (356)
Log message:
Switch to MASTER_SITES_PYPI.
   2015-11-04 03:00:17 by Alistair G. Crooks | Files touched by this commit (797)
Log message:
Add SHA512 digests for distfiles for textproc category

Problems found locating distfiles:
	Package cabocha: missing distfile cabocha-0.68.tar.bz2
	Package convertlit: missing distfile clit18src.zip
	Package php-enchant: missing distfile php-enchant/enchant-1.1.0.tgz

Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden).  All existing
SHA1 digests retained for now as an audit trail.
   2012-10-25 08:57:09 by Aleksej Saushev | Files touched by this commit (587)
Log message:
Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.
   2012-05-30 13:03:50 by Thomas Klausner | Files touched by this commit (4) | Imported package
Log message:
Initial import of py-Unidecode-0.04.9.