./textproc/py-rapidfuzz, Rapid fuzzy string matching


Branch: CURRENT, Version: 3.10.1, Package name: py312-rapidfuzz-3.10.1, Maintainer: pkgsrc-users

RapidFuzz is a fast string matching library for Python and C++ that uses
the string similarity calculations from FuzzyWuzzy.


Master sites:

Filesize: 56623.291 KB

CVS history:


   2024-04-08 07:08:25 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: updated to 3.8.1

3.8.1
Fixed
* use the correct version of `rapidfuzz-cpp` when building against a system installed version
   2024-04-07 23:45:04 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: update to 3.8.0.

[3.8.0] - 2024-04-06
^^^^^^^^^^^^^^^^^^^^
Added
~~~~~
* added ``process.cpdist`` which allows pairwise comparison of two collections of inputs
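
To illustrate the difference between a pairwise comparison and the full cross comparison that ``process.cdist`` performs, here is a minimal sketch in pure Python. It reads "pairwise" as "element i of the first collection against element i of the second", which is an assumption about the entry above. The helpers ``similarity``, ``pairwise_scores`` and ``cross_scores`` are hypothetical names, and the scorer is ``difflib.SequenceMatcher`` from the standard library rather than a RapidFuzz scorer.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Stand-in scorer from the standard library (not a RapidFuzz scorer).
    return SequenceMatcher(None, a, b).ratio()


def pairwise_scores(queries, choices):
    # One score per position: queries[i] is compared with choices[i] only.
    if len(queries) != len(choices):
        raise ValueError("both collections need the same length")
    return [similarity(q, c) for q, c in zip(queries, choices)]


def cross_scores(queries, choices):
    # Full matrix: every query is compared with every choice.
    return [[similarity(q, c) for c in choices] for q in queries]


queries = ["new york", "berlin", "tokyo"]
choices = ["new york city", "bern", "kyoto"]

pair = pairwise_scores(queries, choices)   # 3 scores
cross = cross_scores(queries, choices)     # 3 x 3 scores

print(pair)
print(cross)
```

The pairwise result is the diagonal of the cross matrix, so it needs n comparisons where the full matrix needs n * n.
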

Fixed
~~~~~
- fix some minor errors in the type hints
- fix potentially incorrect results of JaroWinkler when using high prefix weights
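
The log does not say which inputs were affected by the JaroWinkler fix. As background only, the textbook Jaro-Winkler formula adds a prefix bonus to the Jaro similarity, and the result stays within [0, 1] only while the prefix weight is at most 0.25. The sketch below applies that textbook formula to a given Jaro score. ``winkler_boost`` is a hypothetical helper, not a RapidFuzz function, and this is not the code that was fixed.

```python
def winkler_boost(jaro_sim, common_prefix, prefix_weight=0.1):
    """Apply the textbook Winkler prefix bonus to a Jaro similarity."""
    # The textbook definition rewards at most 4 leading characters.
    prefix = min(common_prefix, 4)
    return jaro_sim + prefix * prefix_weight * (1.0 - jaro_sim)


default_weight = winkler_boost(0.8, 4, 0.1)    # usual default weight
limit_weight = winkler_boost(0.8, 4, 0.25)     # largest weight that stays in range
too_high = winkler_boost(0.8, 4, 0.5)          # leaves the [0, 1] range

print(default_weight, limit_weight, too_high)
```
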
   2024-04-06 08:39:59 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: update to 3.7.0.

[3.7.0] - 2024-03-21
^^^^^^^^^^^^^^^^^^^^

Changed
~~~~~~~
* reduce import time
   2024-03-12 09:09:33 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: update to 3.6.2.

[3.6.2] - 2024-03-05
^^^^^^^^^^^^^^^^^^^^

Changed
~~~~~~~
* upgrade to ``Cython==3.0.9``

Fixed
~~~~~
* upgrade ``rapidfuzz-cpp`` which includes a fix for build issues on some compilers
* fix some issues with the sphinx config
   2023-12-31 22:33:52 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: update to 3.6.1.

[3.6.1] - 2023-12-28
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
- fix overflow error on systems with ``sizeof(size_t) < 8``

[3.6.0] - 2023-12-26
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
- fix pure python fallback implementation of ``fuzz.token_set_ratio``
- properly link with ``-latomic`` if ``std::atomic<uint64_t>`` is not natively supported

Performance
~~~~~~~~~~~
* add banded implementation of LCS / Indel. This improves the runtime from ``O((|s1|/64) * |s2|)`` to ``O((score_cutoff/64) * |s2|)``
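
The ``/64`` factor in both bounds is typical of bit-parallel algorithms, where one 64-bit word covers 64 characters of ``s1``. The following is a minimal pure-Python sketch of an unbanded bit-parallel LCS, with a Python integer standing in for the row of machine words, plus the Indel distance derived from it. It is an illustration under those assumptions, not the rapidfuzz-cpp implementation, and it does not implement the band: a banded variant would only update the words that can still affect a result within ``score_cutoff``.

```python
def lcs_length(s1, s2):
    """Length of the longest common subsequence, computed bit-parallel."""
    # masks[ch] has bit i set wherever s1[i] == ch.
    masks = {}
    for i, ch in enumerate(s1):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << len(s1)) - 1   # one bit per character of s1
    state = full                # a zero bit marks a position where the LCS grew
    for ch in s2:
        matches = state & masks.get(ch, 0)
        # Masking with `full` emulates the overflow of fixed-width words.
        state = ((state + matches) | (state - matches)) & full

    # Each cleared bit corresponds to one character of the LCS.
    return len(s1) - bin(state).count("1")


def indel_distance(s1, s2):
    # Insertions and deletions needed: everything outside the LCS.
    return len(s1) + len(s2) - 2 * lcs_length(s1, s2)


print(lcs_length("kitten", "sitting"))
print(indel_distance("kitten", "sitting"))
```

Each loop iteration handles one character of ``s2`` against all of ``s1`` at once, which is where the ``|s1|/64`` word count per step comes from in a fixed-width implementation.
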

Changed
~~~~~~~
* upgrade to ``Cython==3.0.7``
* cdist for many metrics now returns a matrix of ``uint32`` instead of ``int32`` by default
   2023-12-25 23:20:00 by S.P.Zeidler | Files touched by this commit (2)
Log message:
py-rapidfuzz: sort out simd for i386
   2023-11-07 10:14:23 by Thomas Klausner | Files touched by this commit (3) | Package updated
Log message:
py-rapidfuzz: update to 3.5.2.

[3.5.2] - 2023-11-02
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* use _mm_malloc/_mm_free on macOS if aligned_alloc is unsupported

[3.5.1] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Fixed
~~~~~
* fix compilation failure on macOS

[3.5.0] - 2023-10-31
^^^^^^^^^^^^^^^^^^^^
Changed
~~~~~~~
* skip pandas ``pd.NA`` similar to ``None``
* add ``score_multiplier`` argument to ``process.cdist`` which allows multiplying the end result scores with a constant factor.
* drop support for Python 3.7

Performance
~~~~~~~~~~~
* improve performance of simd implementation for ``LCS`` / ``Indel`` / ``Jaro`` / ``JaroWinkler``
* improve performance of Jaro and Jaro Winkler for long sequences
* implement ``process.extract`` with ``limit=1`` using ``process.extractOne`` which can be faster

Fixed
~~~~~
* the preprocessing function was always called through Python due to a broken C-API version check
* fix wraparound issue in simd implementation of Jaro and Jaro Winkler
   2023-10-13 12:14:58 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
py-rapidfuzz: updated to 3.4.0

[3.4.0] - 2023-10-09
Changed
- upgrade to ``Cython==3.0.3``
- add simd implementation for Jaro and Jaro Winkler

[3.3.1] - 2023-09-25
Added
- add missing tag for python 3.12 support

[3.3.0] - 2023-09-11
Changed
- upgrade to ``Cython==3.0.2``
- implement the remaining missing features from the C++ implementation in the pure Python implementation