graphics/tesseract
Open Source OCR Engine

Branch: pkgsrc-2021Q2
Version: 4.1.1nb8
Package name: tesseract-4.1.1nb8
Maintainer: pkgsrc-users

Tesseract provides an OCR engine and a command line program. It
includes a new neural net (LSTM) based OCR engine which is focused on
line recognition, but also still provides a legacy OCR engine which
works by recognizing character patterns. Tesseract has Unicode (UTF-8)
support, and can recognize more than 100 languages "out of the box".
Tesseract can be trained to recognize other languages. It supports
various output formats: plain text, hOCR (HTML), PDF,
invisible-text-only PDF, and TSV.
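
As a rough illustration of the command line program and the output formats listed above, the following Python sketch assembles a `tesseract` invocation. It assumes the `tesseract` binary from this package is on `PATH` and that the matching language data is installed. The helper name `build_tesseract_command` and the file names are hypothetical and not part of the package. Selecting the legacy engine with `--oem 0` only works with traineddata files that include the legacy model.

```python
from typing import List

# Config names shipped with Tesseract 4.x for the output formats above.
# Plain text is the default and needs no config name.
OUTPUT_CONFIGS = {"text": None, "hocr": "hocr", "pdf": "pdf", "tsv": "tsv"}


def build_tesseract_command(
    image: str,
    output_base: str,
    lang: str = "eng",
    fmt: str = "text",
    legacy_engine: bool = False,
    text_only_pdf: bool = False,
) -> List[str]:
    """Return an argv list for the tesseract CLI (hypothetical helper)."""
    if fmt not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {fmt}")
    if text_only_pdf and fmt != "pdf":
        raise ValueError("text_only_pdf requires fmt='pdf'")

    # --oem 1 selects the LSTM engine, --oem 0 the legacy engine.
    cmd = ["tesseract", image, output_base, "-l", lang,
           "--oem", "0" if legacy_engine else "1"]

    # textonly_pdf=1 produces the invisible-text-only PDF variant.
    if text_only_pdf:
        cmd += ["-c", "textonly_pdf=1"]

    # The config name selecting the output format comes last.
    config = OUTPUT_CONFIGS[fmt]
    if config is not None:
        cmd.append(config)
    return cmd
```

Passing the result to `subprocess.run(..., check=True)` runs the OCR; for example, `build_tesseract_command("scan.png", "scan", fmt="pdf")` would be expected to write `scan.pdf` next to the input.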
Master sites:
SHA1: 25318bb3f57ef72d5736730739451673f4a66f51
RMD160: 8218e0271e7dee72a0f2ee0a8c8ce937b94d857f
Filesize: 1928.699 KB
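
The SHA1 value above can be checked against a downloaded distfile with a few lines of Python. This is only a sketch: the function names are hypothetical, and the distfile path has to be supplied by the reader since the page does not name it. RMD160 is left out because `hashlib` support for it depends on the local OpenSSL build.

```python
import hashlib

# SHA1 checksum as listed on this page.
EXPECTED_SHA1 = "25318bb3f57ef72d5736730739451673f4a66f51"


def sha1_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distfile_matches(path: str, expected: str = EXPECTED_SHA1) -> bool:
    """Compare a file's SHA1 digest with the expected hex string."""
    return sha1_of_file(path) == expected.lower()
```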
Version history:
- (2021-07-01) Package added to pkgsrc.se, version tesseract-4.1.1nb8 (created)