./textproc/py-jellyfish, Python library for doing approximate and phonetic matching of strings


Branch: CURRENT, Version: 1.1.0, Package name: py312-jellyfish-1.1.0, Maintainer: pkgsrc-users

Jellyfish is a Python library for approximate and phonetic matching of
strings.

Included Algorithms:

- String comparison:
* Levenshtein Distance
* Damerau-Levenshtein Distance
* Jaro Distance
* Jaro-Winkler Distance
* Match Rating Approach Comparison
* Hamming Distance

- Phonetic encoding:
* American Soundex
* Metaphone
* NYSIIS (New York State Identification and Intelligence System)
* Match Rating Codex
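
To make two of the string comparison measures above concrete, here is a
minimal pure-Python sketch of Levenshtein distance (using the Wagner-Fischer
dynamic program) and Hamming distance. It is an illustration only, not the
package's own implementation. The function names `levenshtein` and `hamming`
are hypothetical names chosen for this sketch, and counting surplus characters
as differences in `hamming` is a convention assumed here.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the Wagner-Fischer dynamic program, one row at a time."""
    if len(a) < len(b):
        a, b = b, a  # iterate the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def hamming(a: str, b: str) -> int:
    """Count differing positions; surplus characters count as differences
    (an assumption of this sketch)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))
```

In the library itself, the corresponding functions are
`jellyfish.levenshtein_distance` and `jellyfish.hamming_distance`.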


Required to run:
[lang/python310]

Master sites:

Filesize: 355.851 KB


CVS history:


   2024-02-04 23:37:10 by Adam Ciarcinski | Files touched by this commit (1)
Log message:
py-jellyfish: add cargo-depends.mk
   2024-02-03 18:16:02 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-jellyfish: updated to 1.0.3

1.0.3 - 17 November 2023
-----------------------
* `match_rating_codex` now raises a `ValueError` when passed non-alpha characters
* adds prebuilt wheels for Python 3.12

1.0.1 - 18 September 2023
-------------------------
* fully remove deprecated names
* add armv7 linux builds
* fully drop Python 3.7 support

1.0.0 - 21 June 2023
--------------------
* bump to 1.0 (no notable changes from 0.11.2)

0.11.2 - 2 April 2023
---------------------
* fix to Rust build process to build more wheels, thanks @MartinoMensio!
* switch to using `ahash` for Damerau-Levenshtein for speed gains

0.11.1 - 30 March 2023
----------------------
* fix missing testdata in packages

0.11.0 - 27 March 2023
----------------------
* switched to using Rust implementation for all algorithms

0.10.0 - 25 March 2023
---------------------
* removed rarely-used `porter_stem` function, better implementations exist

0.9.0 - 7 January 2022
----------------------
* updated documentation available at https://jamesturk.github.io/jellyfish/
* support for Python 3.10+
* handle spaces correctly in MRA algorithm

0.8.9 - 26 October 2021
-----------------------
* fix buffer overflow in NYSIIS
* remove unnecessary/undocumented special casing of digits in Jaro-Winkler

0.8.8 - 17 August 2021
----------------------
* release to fix Linux wheel issue

0.8.7 - 16 August 2021
----------------------
* safer allocations from CJellyfish
* include aarch64 wheels

0.8.4 - 4 August 2021
---------------------
* fix for jaro winkler

0.8.3 - 11 March 2021
---------------------
* build changes
* include OSX and Windows wheels

0.8.2 - 21 May 2020
-------------------
* fix jaro_winkler/jaro_winkler_similarity mix-up
* deprecate jaro_distance in favor of jaro_similarity
  backwards compatible shim left in place, will be removed in 1.0
* (note: 0.8.1 was a broken release without proper C libraries)

0.8.0 - 21 May 2020
-------------------
* rename jaro_winkler to jaro_winkler_similarity to match other functions
  backwards compatible shim added, but will be removed in 1.0
* fix soundex bug with W/H cases
* fix metaphone bug with WH prefix
* fix C match rating codex bug with duplicate letters
* fix metaphone bug with leading vowels and 'kn' pair
* fix Python jaro_winkler bug
* fix Python 3.9 deprecation warning
* add manylinux wheels
   2022-01-04 21:55:40 by Thomas Klausner | Files touched by this commit (1595)
Log message:
*: bump PKGREVISION for egg.mk users

They now have a tool dependency on py-setuptools instead of a DEPENDS
   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
   2020-01-03 09:37:57 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-jellyfish: updated to 0.7.2

0.7.2:
* fix CJellyfish damerau_levenshtein w/ unicode, thanks to immerrr
* fix final H in NYSIIS

0.7.1:
* restrict install to Python >= 3.4

0.7.0:
* drop Python 2 compatibility & legacy code
* add bugfix for NYSIIS for words starting with PF
   2018-04-30 08:43:15 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-jellyfish: updated to 0.6.1

0.6.1:
* fixed wheel release issue
   2018-04-09 09:34:15 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-jellyfish: updated to 0.6.0

0.6.0:
* fix quite a few bugs & differences between C/Py implementations
* add wagner-fischer testdata
* uppercase soundex result
* better error handling in nysiis, soundex, and jaro