Subject: CVS commit: pkgsrc/textproc/sentencepiece
From: Thomas Klausner
Date: 2023-03-13 15:17:12
Message id: 20230313141712.67FD7FA90@cvs.NetBSD.org

Log Message:
textproc/sentencepiece: import sentencepiece-0.1.97

SentencePiece is an unsupervised text tokenizer and detokenizer
mainly for Neural Network-based text generation systems where the
vocabulary size is predetermined prior to the neural model training.
SentencePiece implements subword units (e.g., byte-pair encoding
(BPE) and the unigram language model), extended with direct
training from raw sentences. This makes it possible to build a
purely end-to-end system that does not depend on language-specific
pre/postprocessing.
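
To illustrate what a subword unit scheme such as BPE does, here is a
minimal toy sketch in plain Python. It is not SentencePiece's
implementation or API: the function names (merge_pair, learn_bpe,
apply_bpe) and the tie-breaking rule are assumptions made for this
example, and it works on pre-split words rather than raw sentences.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Each distinct word starts as a tuple of single characters, with its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself
        # (an assumption of this sketch, chosen only for determinism).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite the vocabulary with the chosen pair merged.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(merge_pair(list(symbols), best))] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment `word` by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


merges = learn_bpe(["low", "low", "lower", "lowest"], 2)
pieces = apply_bpe("lowest", merges)
```

With this toy corpus, frequent character sequences are merged into
larger pieces while rarer suffixes stay split into single characters.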

Files:
Revision  Action  file
1.1       add     pkgsrc/textproc/sentencepiece/DESCR
1.1       add     pkgsrc/textproc/sentencepiece/Makefile
1.1       add     pkgsrc/textproc/sentencepiece/Makefile.common
1.1       add     pkgsrc/textproc/sentencepiece/PLIST
1.1       add     pkgsrc/textproc/sentencepiece/buildlink3.mk
1.1       add     pkgsrc/textproc/sentencepiece/distinfo