graphics/tesseract, Open Source OCR Engine
Branch: pkgsrc-2019Q1
Version: 4.0.0nb4
Package name: tesseract-4.0.0nb4
Maintainer: pkgsrc-users

Tesseract provides an OCR engine and a command line program. It
includes a new neural net (LSTM) based OCR engine which is focused on
line recognition, but also still provides a legacy OCR engine which
works by recognizing character patterns. Tesseract has Unicode (UTF-8)
support, and can recognize more than 100 languages "out of the box".
Tesseract can be trained to recognize other languages. It supports
various output formats: plain text, hOCR (HTML), PDF,
invisible-text-only PDF, and TSV.
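
As a rough illustration of the command line program and output formats described above, the following Python sketch builds a `tesseract` invocation. It assumes the usual Tesseract 4.x calling convention (`tesseract IMAGE OUTPUTBASE [-l LANG] [--oem N] [CONFIGFILE...]`), that `--oem 1` selects the LSTM engine and `--oem 0` the legacy engine, and that the `hocr`, `pdf` and `tsv` config files are installed. The helper names `build_command` and `run_ocr` and the file names in the usage note are hypothetical. The legacy engine also needs traineddata files that still contain the legacy model, which is an assumption about the local setup.

```python
import shutil
import subprocess

# Output format -> Tesseract config file name (assumed to be installed).
# Plain text is the default output, so it needs no config file.
OUTPUT_CONFIGS = {"text": None, "hocr": "hocr", "pdf": "pdf", "tsv": "tsv"}


def build_command(image, output_base, lang="eng", fmt="text", legacy=False):
    """Return the argv list for one tesseract run (hypothetical helper)."""
    if fmt not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {fmt}")
    # Assumption: --oem 0 is the legacy engine, --oem 1 the LSTM engine.
    oem = "0" if legacy else "1"
    cmd = ["tesseract", image, output_base, "-l", lang, "--oem", oem]
    config = OUTPUT_CONFIGS[fmt]
    if config is not None:
        cmd.append(config)
    return cmd


def run_ocr(image, output_base, **options):
    """Run tesseract if it is on PATH; raises if it is missing or fails."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_command(image, output_base, **options), check=True)
```

For example, `run_ocr("scan.png", "scan", lang="deu", fmt="pdf")` would be expected to write `scan.pdf`, provided the German language data is installed.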
Required to run: [textproc/icu] [graphics/leptonica] [graphics/cairo] [devel/pango]
Required to build: [textproc/asciidoc] [x11/xcb-proto] [x11/xorgproto] [pkgtools/x11-links] [pkgtools/cwrappers]
Master sites:
SHA1: 243a4919d44bc64d1e7e4cac660c716c845a8d03
RMD160: 0e95d343639ab98c6d3fbc528053b627b6e12282
Filesize: 1915.402 KB
Version history:
- (2019-04-11) Package added to pkgsrc.se, version tesseract-4.0.0nb4 (created)