./graphics/tesseract, Open Source OCR Engine

Branch: pkgsrc-2018Q4, Version: 4.0.0nb4, Package name: tesseract-4.0.0nb4, Maintainer: pkgsrc-users

This code is a raw OCR engine. It has NO PAGE LAYOUT ANALYSIS, NO
OUTPUT FORMATTING, and NO UI. It can only process an image of a
single column and create text from it. It can detect fixed pitch
vs proportional text. Having said that, in 1995, this engine was
in the top 3 in terms of character accuracy, and it compiles and
runs on both Linux and Windows. Another current limitation is that
it only recognizes English and its character set is only US-ASCII.
Training code is NOT included in the open source release; however,
it will be included in a future release.


Required to run:
[devel/pango]

Required to build:
[pkgtools/cwrappers] [pkgtools/x11-links]

Master sites:

SHA1: 243a4919d44bc64d1e7e4cac660c716c845a8d03
RMD160: 0e95d343639ab98c6d3fbc528053b627b6e12282
Filesize: 1915.402 KB
