./graphics/tesseract, Open Source OCR Engine


Branch: pkgsrc-2019Q3, Version: 4.1.0, Package name: tesseract-4.1.0, Maintainer: pkgsrc-users

Tesseract provides an OCR engine and a command line program. It
includes a new neural net (LSTM) based OCR engine which is focused on
line recognition, but also still provides a legacy OCR engine which
works by recognizing character patterns. Tesseract has Unicode (UTF-8)
support, and can recognize more than 100 languages "out of the box".
Tesseract can be trained to recognize other languages. It supports
various output formats: plain text, hOCR (HTML), PDF,
invisible-text-only PDF, and TSV.
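
As a rough illustration of how the engines and output formats listed above map onto the command line program, the following Python sketch only assembles an argument list for the tesseract binary; it does not run it. The helper name build_tesseract_command and its parameters are hypothetical, chosen for this example. It assumes the Tesseract 4 conventions that "--oem 0" selects the legacy engine, "--oem 1" selects the LSTM engine, that the output format is chosen by a trailing config name (txt, hocr, pdf, tsv), and that the invisible-text-only PDF is requested with the textonly_pdf variable.

```python
from typing import List

# Output formats named in the package description, mapped to the
# config file names assumed to ship with Tesseract 4.
OUTPUT_CONFIGS = {"txt": "txt", "hocr": "hocr", "pdf": "pdf", "tsv": "tsv"}


def build_tesseract_command(
    image: str,
    output_base: str,
    fmt: str = "txt",
    lang: str = "eng",
    legacy: bool = False,
    text_only_pdf: bool = False,
) -> List[str]:
    """Return an argv list for the tesseract CLI (hypothetical helper)."""
    if fmt not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {fmt}")
    if text_only_pdf and fmt != "pdf":
        raise ValueError("text_only_pdf only applies to PDF output")

    # --oem 0 = legacy character-pattern engine, --oem 1 = LSTM engine.
    # Assumption: the legacy engine also needs trained data that still
    # contains the legacy models.
    cmd = ["tesseract", image, output_base, "-l", lang,
           "--oem", "0" if legacy else "1"]

    if text_only_pdf:
        # Produces a PDF that carries only the invisible text layer.
        cmd += ["-c", "textonly_pdf=1"]

    # The config name goes last and selects the output renderer.
    cmd.append(OUTPUT_CONFIGS[fmt])
    return cmd
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a system where the package is installed; the output file name is derived by tesseract from output_base plus a format-specific extension.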


Required to run:
[devel/pango] [textproc/icu] [graphics/leptonica] [graphics/cairo]

Required to build:
[textproc/asciidoc] [x11/xorgproto] [x11/xcb-proto] [pkgtools/cwrappers] [pkgtools/x11-links]

SHA1: 6e88cc4fd9f1681142bf74dc2df0559202cff3c2
RMD160: 034ffd9690478e28945c09001ce51f7fdceb2ff5
Filesize: 1918.997 KB
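
The SHA1 value above can be checked against a downloaded distfile by hand. The sketch below is one minimal way to do that in Python with the standard hashlib module; the function names and the idea of passing in a local file path are assumptions made for this example, and the distfile name is not given here, so the caller supplies it. RMD160 is left out because its availability in hashlib depends on the local OpenSSL build.

```python
import hashlib

# SHA1 digest as listed for tesseract-4.1.0 on this page.
EXPECTED_SHA1 = "6e88cc4fd9f1681142bf74dc2df0559202cff3c2"


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA1) -> bool:
    """Compare a file's SHA1 against the expected hex digest (case-insensitive)."""
    return sha1_of_file(path).lower() == expected.lower()
```

Calling matches_expected with the path of the fetched distfile returns True only if the file hashes to the digest listed above.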
