./textproc/py-nltk, Natural Language Toolkit (NLTK)


Branch: CURRENT, Version: 3.8.1nb1, Package name: py312-nltk-3.8.1nb1, Maintainer: pkgsrc-users
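
The package name above follows the usual pkgsrc pattern for Python packages: a `pyXY` prefix for the Python version, the package's base name, the upstream version, and an optional `nbN` suffix carrying the pkgsrc revision. The sketch below shows one way to take such a name apart. The helper `split_pkgname` and its regular expression are hypothetical illustrations, not part of pkgsrc, and assume the name has exactly this shape.

```python
import re

# Hypothetical pattern (an assumption for illustration, not a pkgsrc tool):
# "py" + Python version digits, base name, upstream version, optional "nbN".
PKGNAME_RE = re.compile(
    r"^py(?P<pyver>\d+)-(?P<name>.+)-(?P<version>[^-]+?)(?:nb(?P<rev>\d+))?$"
)


def split_pkgname(pkgname):
    """Split a name like 'py312-nltk-3.8.1nb1' into its components."""
    m = PKGNAME_RE.match(pkgname)
    if m is None:
        raise ValueError(f"not a pyXY-name-version package name: {pkgname!r}")
    pyver = m["pyver"]
    return {
        # "312" is read as major "3", minor "12"
        "python": f"{pyver[0]}.{pyver[1:]}",
        "name": m["name"],
        "version": m["version"],
        # a missing "nbN" suffix is treated as revision 0
        "revision": int(m["rev"] or 0),
    }


print(split_pkgname("py312-nltk-3.8.1nb1"))
```

Read this way, `3.8.1nb1` is upstream release 3.8.1 with one pkgsrc-side revision on top of it.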

NLTK - the Natural Language Toolkit - is a suite of open source
Python modules, data and documentation for research and development
in natural language processing. NLTK contains code supporting dozens
of NLP tasks, along with 30 popular corpora and extensive documentation
including a 360-page online book.


Required to run:
[devel/py-setuptools] [databases/py-sqlite3] [devel/py-pyparsing] [devel/py-click] [textproc/py-regex] [misc/py-tqdm] [devel/py-joblib] [lang/python310]

Filesize: 4512.098 KB


CVS history:


   2023-08-02 01:20:57 by Thomas Klausner | Files touched by this commit (158)
Log message:
*: remove more references to Python 3.7
   2023-07-01 10:37:47 by Thomas Klausner | Files touched by this commit (105) | Package updated
Log message:
*: restrict py-numpy users to 3.9+ in preparation for update
   2023-05-03 11:53:50 by Thomas Klausner | Files touched by this commit (1)
Log message:
py-nltk: use nltk_data-omw-1.4
   2023-05-02 22:45:06 by Thomas Klausner | Files touched by this commit (2) | Package updated
Log message:
py-nltk: update to 3.8.1.

Version 3.8.1 2023-01-02

* Resolve RCE vulnerability in localhost WordNet Browser (#3100)
* Remove unused tool scripts (#3099)
* Resolve XSS vulnerability in localhost WordNet Browser (#3096)
* Add Python 3.11 support (#3090)
   2022-12-16 00:15:24 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-nltk: updated to 3.8

Version 3.8 2022-12-12

* Refactor dispersion plot
* Provide type hints for LazyCorpusLoader variables
* Throw warning when LanguageModel is initialized with incorrect vocabulary
* Fix WordNet's all_synsets() function
* Resolve TreebankWordDetokenizer inconsistency with end-of-string contractions
* Support both iso639-3 codes and BCP-47 language tags
* Avoid DeprecationWarning in Regexp tokenizer
* Fix many doctests, add doctests to CI
* Fix bool field not being read in VerbNet
* Greatly improve time efficiency of SyllableTokenizer when tokenizing numbers
* Fix encodings of Polish udhr corpus reader
* Allow TweetTokenizer to tokenize emoji flag sequences
* Prevent LazyModule from increasing the size of nltk.__dict__
* Fix CoreNLPServer non-default port issue
* Add "acion" suffix to the Spanish SnowballStemmer
* Allow loading WordNet without OMW
* Use input() in nltk.chat.chatbot() for Jupyter support
* Fix edit_distance_align() in distance.py
* Tackle performance and accuracy regression of sentence tokenizer since NLTK 3.6.6
* Add the Iota operator to semantic logic
* Resolve critical errors in WordNet app
* Resolve critical error in CHILDES Corpus
* Make WordNet information_content() accept adjective satellites
* Add "strict=True" parameter to CoreNLP
* Resolve issue with WordNet's synset_from_sense_key
* Handle WordNet synsets that were lost in mapping
* Resolve TypeError in Boxer
* Add function to retrieve WordNet synonyms
* Warn about nonexistent OMW offsets instead of raising an error
* Fix missing ic argument in res, jcn and lin similarity functions of WordNet
* Add support for the extended OMW
* Fix LC cutoff policy of text tiling
* Optimize ConditionalFreqDist.__add__ performance
* Add Markdown corpus reader
   2022-11-30 11:46:00 by Adam Ciarcinski | Files touched by this commit (1)
Log message:
py-nltk: add ALTERNATIVES
   2022-11-29 18:09:45 by Adam Ciarcinski | Files touched by this commit (3) | Package updated
Log message:
py-nltk: updated to 3.7

NLTK 3.7 release: February 2022

* Improve and update the NLTK team page on nltk.org
* Drop support for Python 3.6
* Add support for Python 3.10
   2022-05-15 12:05:16 by Nia Alarie | Files touched by this commit (3)
Log message:
*: py37 incompatibility via matplotlib via numpy