math/R-stringdist,
Approximate String Matching and String Distance Functions
Branch: CURRENT,
Version: 0.9.10,
Package name: R-stringdist-0.9.10,
Maintainer: pkgsrc-users
Implements an approximate string matching version of R's native
'match' function. Can calculate various string distances based on
edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string
alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic
metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided
as well. Distances can be computed between character vectors while
taking proper care of encoding or between integer vectors representing
generic sequences. This package is built for speed and runs in
parallel by using 'OpenMP'. An API for C or C++ is exposed as well.
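To make the edit-based distances and the approximate 'match' idea above concrete, here is a minimal Python sketch. It is not the package's code or API: the function names levenshtein and approx_match, the max_dist parameter and the 0-based return index are assumptions made for illustration, and the package itself is implemented in C and called from R.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the minimum single-character insertions, deletions and
    substitutions needed to turn a into b (plain Levenshtein distance)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def approx_match(x: str, table: list[str], max_dist: int = 1):
    """Hypothetical helper: return the 0-based index of the entry in table
    closest to x, or None if no entry is within max_dist edits."""
    best_idx, best_dist = None, max_dist + 1
    for idx, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:  # strict comparison keeps the first of equal matches
            best_idx, best_dist = idx, d
    return best_idx


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(approx_match("leia", ["uhura", "leela"], max_dist=2))
```

The sketch covers only the Levenshtein case. The other edit-based distances listed above differ in which operations they allow, for example Hamming permits substitutions only, and the optimal string alignment and Damerau-Levenshtein distances also count a swap of adjacent characters as one edit.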
Required to run: [math/R]
Required to build: [pkgtools/cwrappers]
Version history:
- (2023-06-02) Updated to version: R-stringdist-0.9.10
- (2021-09-18) Updated to version: R-stringdist-0.9.8
- (2020-02-10) Package added to pkgsrc.se, version R-stringdist-0.9.5.5 (created)
CVS history:
2023-06-02 15:49:35 by Makoto Fujiwara | Files touched by this commit (2) |
Log message:
(math/R-stringdist) Updated 0.9.8 to 0.9.10
(from NEWS, no info on 0.9.10)
version 0.9.9
- Fixed warnings generated by new C compiler. (function prototypes must
now be defined completely). (Thanks to Kurt Hornik for the heads-up.)
|
2021-10-26 12:56:13 by Nia Alarie | Files touched by this commit (458) |
Log message:
math: Replace RMD160 checksums with BLAKE2s checksums
All checksums have been double-checked against existing RMD160 and
SHA512 hashes
|
2021-10-07 16:28:36 by Nia Alarie | Files touched by this commit (458) |
Log message:
math: Remove SHA1 hashes for distfiles
|
2021-09-18 14:38:42 by Makoto Fujiwara | Files touched by this commit (2) |
Log message:
(math/R-stringdist) Updated 0.9.5.5 to 0.9.8
version 0.9.8
- Fixed some issues on C-level causing problems with the
CLANG compiler. (Thanks to Brian Ripley for not only
reporting this, but also sending updated code with
fixes).
version 0.9.7
- Fixes in use of INTEGER() and VECTOR_ELT() after updates in R's C API.
This affected 'afind' and 'max_length' (internally). (Thanks to Luke
Tierney and Kurt Hornik for the notification).
- Fix in 'amatch' causing utf-8 characters to be ignored in some
cases (thanks to Joan Mime for reporting #78).
- Fix: segfault when 'afind' was called with many search patterns or many
texts to be searched.
- Fix: stringsimmatrix was not normalized correctly (Thanks to Tamas Ferenci
for reporting GH).
version 0.9.6.3
- Resubmit. Fixed a URL redirect that was detected by CRAN.
version 0.9.6.2
- Resubmit. Fixed URL issues detected by CRAN, added DOI to description
as per CRAN request.
version 0.9.6.1
- Bugfix: afind/grab/grabl returned wrong results on MacOS only.
(thanks to Prof. Brian Ripley for the notification and for running tests
on his personal machine and to Tomas Kalibera for making the
ubuntu-rchk docker image available).
version 0.9.6
- New function 'afind': find approximate matches in text based on string distance.
- New functions 'grab', 'grabl': fuzzy matching equivalent to 'grep' and 'grepl'.
- New function 'extract': fuzzy matching equivalent of stringr::str_extract.
- New algorithm 'running_cosine': fast fuzzy text search using cosine distance.
- New function 'stringsimmatrix' (Thanks to Johannes Gruber).
- Number of threads used is now reported when loading 'stringdist'.
- Internal fixes (in some cases class() == 'class' was used).
|
2020-02-10 15:21:00 by Makoto Fujiwara | Files touched by this commit (3) |
Log message:
(math/R-stringdist) import R-stringdist-0.9.5.5
Implements an approximate string matching version of R's native
'match' function. Can calculate various string distances based on
edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string
alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic
metrics (Jaro, Jaro-Winkler). An implementation of soundex is provided
as well. Distances can be computed between character vectors while
taking proper care of encoding or between integer vectors representing
generic sequences. This package is built for speed and runs in
parallel by using 'OpenMP'. An API for C or C++ is exposed as well.
|