./wip/py-nltk, Natural Language Toolkit (NLTK)



Branch: CURRENT, Version: 3.2.2, Package name: py27-nltk-3.2.2, Maintainer: pkgsrc-users

NLTK - the Natural Language Toolkit - is a suite of open source
Python modules, data and documentation for research and development
in natural language processing. NLTK contains code supporting dozens
of NLP tasks, along with 30 popular corpora and extensive documentation
including a 360-page online book.


Required to run:
[textproc/py-expat] [textproc/py-yaml] [devel/py-setuptools] [databases/py-sqlite3] [devel/py-nose] [lang/python27]

Required to build:
[pkgtools/cwrappers]

Master sites:

SHA1: 1c7418646abcf2a421e552a5a7bce320ef6963a0
RMD160: 9e30afc620849d35cd85a6db06dc5d0a69309de2
Filesize: 2824.263 KB

CVS history:


   2015-09-05 11:23:44 by Thomas Klausner | Files touched by this commit (3) | Package updated
Log message:
Update to 3.0.4:

Version 3.0.4 2015-07-13
* minor bug fixes and enhancements

Thanks to the following contributors to 3.0.4:
Nicola Bova, Santiago Castro, Len Remmerswaal, Keith Suderman, kabayan55,
pln-fing-udelar (NLP Group, Instituto de Computación, Facultad de Ingeniería,
Universidad de la República, Uruguay).

	
Version 3.0.3 2015-06-12
* bug fixes (Stanford NER, Boxer, Snowball, treebank tokenizer,
    dependency graph, KneserNey, BLEU)
* code clean-ups
* default POS tagger permits tagset to be specified
* gensim illustration
* tgrep implementation
* added PanLex Swadesh corpora
* visualisation for aligned bitext
* support for Google App Engine
* POSTagger renamed StanfordPOSTagger, NERTagger renamed StanfordNERTagger

Thanks to the following contributors to 3.0.3:

Long Duong, Pedro Fialho, Dan Garrette, Helder, Saimadhav Heblikar,
Chris Inskip, David Kamholz, Dmitrijs Milajevs, Smitha Milli,
Tom Mortimer-Jones, Avital Pekker, Jonathan Pool, Sam Raker,
Will Roberts, Dmitry Sadovnychyi, Nathan Schneider, Anirudh W
	
Version 3.0.2 2015-03-13
* make pretty-printing method names consistent
* improvements to Portuguese stemmer
* transition-based dependency parsers
* dependency graph visualisation for ipython notebook
* interfaces for Senna, BLLIP, python-crfsuite
* NKJP corpus reader
* code clean ups, minor bug fixes

Thanks to the following contributors to 3.0.2:
	
Long Duong, Saimadhav Heblikar, Helder, Mikhail Korobov, Denis Krusko,
Alex Louden, Felipe Madrigal, David McClosky, Dmitrijs Milajevs,
Ondrej Platek, Nathan Schneider, Dávid Márk Nemeskey, 0ssifrage, ducki13, kiwipi.

Version 3.0.1 2015-01-12
* fix setup.py for new version of setuptools
	
Version 3.0.0 2014-09-07
* minor bugfixes
* added phrase extraction code by Liling Tan and Fredrik Hedman

Thanks to the following contributors to 3.0.0:
Mark Amery, Ivan Barria, Ingolf Becker, Francis Bond, Lars
Buitinck, Cristian Capdevila, Arthur Darcet, Michelle Fullwood,
Dan Garrette, Dougal Graham, Lauri
Hallila, Tyler Hartley, Fredrik Hedman, Ofer Helman, Bruce Hill,
Marcus Huderle, Nancy Ide, Nick Johnson, Angelos Katharopoulos,
Ewan Klein, Mikhail Korobov, Chris Liechti, Joseph Lynch,
Haejoong Lee, Peter Ljunglöf, Dean Malmgren, Rob
Malouf, Thorsten Marek, Dmitrijs Milajevs, Shari A’aidil
Nasruddin, Lance Nathan, Joel Nothman, Alireza Nourian, Alexander
Oleynikov, Ted Pedersen, Jacob Perkins, Will Roberts, Alex
Rudnick, Nathan Schneider, Geraldine Sim Wei Ying, Lynn Soe,
Liling Tan, Louis Tiao, Marcus Uneson, Yu Usami, Steven Xu, Zhe
Wang, Chuck Wooters, lade, isnowfy, onesandzeros, pquentin, wvanlint

Version 3.0b2 2014-08-21
* minor bugfixes and clean-ups
* renamed remaining parse_ methods to read_ or load_, cf issue #656
* added Paice's method of evaluating stemming algorithms

Thanks to the following contributors to 3.0b2: Lars Buitinck,
Cristian Capdevila, Lauri Hallila, Ofer Helman, Dmitrijs Milajevs,
lade, Liling Tan, Steven Xu

Version 3.0.0b1 2014-07-11
* Added SentiWordNet corpus and corpus reader
* Fixed support for 10-column dependency file format
* Changed Tree initialization to use fromstring

Thanks to the following contributors to 3.0.0b1: Mark Amery, Ivan
Barria, Ingolf Becker, Francis Bond, Lars Buitinck, Arthur Darcet,
Michelle Fullwood, Dan Garrette, Dougal Graham, Tyler Hartley,
Ofer Helman, Bruce Hill, Marcus Huderle, Nancy
Ide, Nick Johnson, Angelos Katharopoulos, Ewan Klein, Mikhail Korobov,
Chris Liechti, Joseph Lynch, Haejoong Lee, Peter
Ljunglöf, Dean Malmgren, Rob Malouf, Thorsten Marek, Dmitrijs
Milajevs, Shari A’aidil Nasruddin, Lance Nathan, Joel Nothman, Alireza
Nourian, Alexander Oleynikov, Ted Pedersen, Jacob Perkins, Will
Roberts, Alex Rudnick, Nathan Schneider, Geraldine Sim Wei Ying, Lynn
Soe, Liling Tan, Louis Tiao, Marcus Uneson, Yu Usami, Steven Xu, Zhe
Wang, Chuck Wooters, isnowfy, onesandzeros, pquentin, wvanlint
	
Version 3.0a4 2014-05-25
* IBM Models 1-3, BLEU, Gale-Church aligner
* Lesk algorithm for WSD
* Open Multilingual WordNet
* New implementation of Brill Tagger
* Extend BNCCorpusReader to parse the whole BNC
* MASC Tagged Corpus and corpus reader
* Interface to Stanford Parser
* Code speed-ups and clean-ups
* API standardisation, including fromstring method for many objects
* Improved regression testing setup
* Removed PyYAML dependency

Thanks to the following contributors to 3.0a4:
Ivan Barria, Ingolf Becker, Francis Bond, Arthur Darcet, Dan Garrette,
Ofer Helman, Dougal Graham, Nancy Ide, Ewan Klein, Mikhail Korobov,
Chris Liechti, Peter Ljunglof, Joseph Lynch, Rob Malouf, Thorsten Marek,
Dmitrijs Milajevs, Shari A’aidil Nasruddin, Lance Nathan, Joel Nothman,
Jacob Perkins, Lynn Soe, Liling Tan, Louis Tiao, Marcus Uneson, Steven Xu,
Geraldine Sim Wei Ying

   2014-02-22 03:58:30 by Thomas Klausner | Files touched by this commit (1)
Log message:
Add another script that needs interpreter replacement.

   2014-02-22 03:57:09 by Thomas Klausner | Files touched by this commit (1)
Log message:
Fix MASTER_SITES.

   2014-02-22 03:56:07 by Thomas Klausner | Files touched by this commit (1)
Log message:
Sort.

   2014-02-22 00:53:55 by Hiramatsu Yoshifumi | Files touched by this commit (3) | Package updated
Log message:
Update py-nltk to 3.0a3.

Changes from previous:
----------------------
Version 3.0a3 2013-11-02
* support for FrameNet contributed by Chuck Wooters
* support for Universal Declaration of Human Rights Corpus (udhr2)
* major API changes:
  - Tree.node -> Tree.label() / Tree.set_label()
  - Chunk parser: top_node -> root_label; chunk_node -> chunk_label
  - WordNet properties are now access methods, e.g.
    Synset.definition -> Synset.definition()
  - relextract: show_raw_rtuple() -> rtuple(), show_clause() -> clause()
* bugfix in texttiling
* replaced simplify_tags with support for universal tagset
  (simplify_tags=True -> tagset='universal')
* Punkt default behavior changed to realign sentence boundaries after
  trailing parentheses and quotes
* deprecated classify.svm (use scikit-learn instead)
* various efficiency improvements
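
Code that has to run against NLTK releases on both sides of the 3.0a3 API
changes listed above could use a small compatibility helper. The following
is a minimal sketch, not NLTK code: tree_label and synset_definition are
hypothetical helper names, and OldStyleTree / NewStyleTree are stand-in
classes that exist only so the example runs without NLTK installed.

```python
def tree_label(tree):
    """Return a tree's label under either API.

    From 3.0a3 on the label is read with Tree.label(); earlier
    releases exposed it as the Tree.node attribute.
    """
    label = getattr(tree, "label", None)
    if callable(label):
        return label()
    return tree.node


def synset_definition(synset):
    """Return a definition whether it is a property or an access method."""
    definition = synset.definition
    return definition() if callable(definition) else definition


# Stand-ins for illustration only (assumptions, not NLTK classes).
class OldStyleTree:
    def __init__(self, node):
        self.node = node          # pre-3.0a3 style attribute


class NewStyleTree:
    def __init__(self, label):
        self._label = label

    def label(self):              # 3.0a3+ style access method
        return self._label


if __name__ == "__main__":
    print(tree_label(OldStyleTree("S")))
    print(tree_label(NewStyleTree("NP")))
```

With NLTK installed, the same helpers should accept real Tree and Synset
objects, since they only probe for the attribute or method named in the
change list.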

   2014-01-25 11:38:08 by Thomas Klausner | Files touched by this commit (171) | Package updated
Log message:
Mark packages as not ready for python-3.x where applicable;
either because they themselves are not ready or because a
dependency isn't. This is annotated by
PYTHON_VERSIONS_INCOMPATIBLE=  33 # not yet ported as of x.y.z
or
PYTHON_VERSIONS_INCOMPATIBLE=  33 # py-foo, py-bar
respectively, please use the same style for other packages,
and check during updates.
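
Checking the annotation during updates could be partly automated. The
following is a minimal sketch, assuming the annotation sits on a single
Makefile line in the style shown above; parse_incompatible is a hypothetical
helper name and is not part of pkgsrc.

```python
import re

# Matches lines such as:
#   PYTHON_VERSIONS_INCOMPATIBLE=  33 # not yet ported as of x.y.z
ANNOTATION = re.compile(
    r"^PYTHON_VERSIONS_INCOMPATIBLE\s*\+?=\s*"
    r"(?P<versions>[^#]*?)\s*(?:#\s*(?P<reason>.*))?$"
)


def parse_incompatible(line):
    """Return (versions, reason) for an annotation line, else None.

    versions is a list of version strings; reason is the trailing
    comment text, or None when the line carries no comment.
    """
    match = ANNOTATION.match(line.strip())
    if match is None:
        return None
    versions = match.group("versions").split()
    reason = match.group("reason")
    return versions, (reason.strip() if reason else None)


if __name__ == "__main__":
    sample = "PYTHON_VERSIONS_INCOMPATIBLE=  33 # py-foo, py-bar"
    print(parse_incompatible(sample))
```

A line without a comment yields a reason of None, which flags an annotation
that does not say why the package is marked.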

Use versioned_dependencies.mk where applicable.
Use REPLACE_PYTHON instead of handcoded alternatives, where applicable.
Reorder Makefile sections into standard order, where applicable.

Remove PYTHON_VERSIONS_INCLUDE_3X lines since that will be default
with the next commit.

Whitespace cleanups and other nits corrected, where necessary.

   2013-09-01 20:16:31 by Kamel Derouiche | Files touched by this commit (3) | Package updated
Log message:
UPDATE VERSION

   2012-10-07 14:25:28 by Aleksej Saushev | Files touched by this commit (17)
Log message:
Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.
Mark packages that do not, or probably do not, support staged installation.