./graphics/tesseract, Open Source OCR Engine


Branch: CURRENT, Version: 5.3.4, Package name: tesseract-5.3.4, Maintainer: pkgsrc-users

Tesseract provides an OCR engine and a command line program. It
includes a new neural net (LSTM) based OCR engine that focuses on
line recognition, and it still provides a legacy OCR engine that
works by recognizing character patterns. Tesseract supports Unicode
(UTF-8) and can recognize more than 100 languages "out of the box".
It can also be trained to recognize other languages. It supports
various output formats: plain text, hOCR (HTML), PDF,
invisible-text-only PDF, and TSV.
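
The output formats listed above are selected on the tesseract command line by naming a config after the output base. The following is a minimal sketch in Python, not part of the package: the helper names build_command and run_ocr and the file names in the usage note are hypothetical, and it assumes the stock config files hocr, pdf and tsv and the textonly_pdf parameter are installed with the package.

```python
import shutil
import subprocess

# Config names passed after the output base to pick an output format.
# Plain text is the default, so it needs no config name.
OUTPUT_CONFIGS = {"text": None, "hocr": "hocr", "pdf": "pdf", "tsv": "tsv"}


def build_command(image, output_base, fmt="text", lang="eng", text_only_pdf=False):
    """Return the argument list for one tesseract run (hypothetical helper)."""
    if fmt not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {fmt}")
    if text_only_pdf and fmt != "pdf":
        raise ValueError("text_only_pdf only applies to PDF output")

    cmd = ["tesseract", image, output_base, "-l", lang]
    if text_only_pdf:
        # Assumption: textonly_pdf=1 yields the invisible-text-only PDF.
        cmd += ["-c", "textonly_pdf=1"]
    config = OUTPUT_CONFIGS[fmt]
    if config is not None:
        cmd.append(config)
    return cmd


def run_ocr(image, output_base, **options):
    """Run tesseract, failing early if the binary is not on PATH."""
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract binary not found on PATH")
    subprocess.run(build_command(image, output_base, **options), check=True)
```

For example, run_ocr("scan.png", "scan", fmt="pdf") would be expected to write scan.pdf next to the working directory, while the default call writes scan.txt. Building the argument list separately from running it keeps the sketch checkable without an installed binary.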


Required to run:
[textproc/icu] [graphics/cairo] [devel/pango] [graphics/leptonica]

Required to build:
[textproc/asciidoc] [pkgtools/x11-links] [x11/xcb-proto] [pkgtools/cwrappers] [x11/xorgproto]

Master sites:

Filesize: 1873.358 KB

CVS history:


   2024-01-19 16:17:49 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
tesseract: updated to 5.3.4

5.3.4

Fixes for autoconf, clang and sw builds
Send output of combine_tessdata -d to stdout instead of stderr. Fixes #4149
Move bail_out function before libtoolize check
Improve OCR for an image URL
Fail on curl download errors
Add new parameter curl_cookiefile for curl_easy_setopt
Set User-Agent: header field in HTTP request for curl downloads
Force TCP v4 for socket to ScrollView server. Fixes #3000
Fix some compiler warnings and avoid unnecessary conversions from std::string to char pointer
Fix a tiny typo in publictypes.h
Other small improvements for code and documentation.
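
Scripts that parse the listing from combine_tessdata -d may need to cope with the stream change noted in the 5.3.4 entry above. A minimal sketch in Python, assuming releases before 5.3.4 wrote the listing to stderr as the log message implies; the function name and the runner parameter are hypothetical and exist only for illustration and testing.

```python
import subprocess


def list_traineddata_components(path, runner=subprocess.run):
    """Return the component listing printed by `combine_tessdata -d PATH`."""
    result = runner(
        ["combine_tessdata", "-d", path],
        capture_output=True,
        text=True,
        check=True,
    )
    # Prefer stdout (5.3.4 behaviour per the log message above); fall back
    # to stderr, assumed to be where earlier releases wrote the listing.
    return result.stdout if result.stdout.strip() else result.stderr
```

The fallback keeps one code path working across versions, at the cost of also returning any diagnostic text an older release printed on stderr.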
   2023-11-14 15:03:25 by Thomas Klausner | Files touched by this commit (1145)
Log message:
*: recursive bump for cairo dependency changes
   2023-11-12 14:24:43 by Thomas Klausner | Files touched by this commit (2570)
Log message:
*: revbump for new brotli option for freetype2

Addresses PR 57693
   2023-11-08 14:21:43 by Thomas Klausner | Files touched by this commit (2377)
Log message:
*: recursive bump for icu 74.1
   2023-10-21 19:11:59 by Greg Troxel | Files touched by this commit (1345) | Package updated
Log message:
recursive revbump for tiff update
   2023-10-09 11:40:21 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
tesseract: updated to 5.3.3

5.3.3

Disable -mfpu=neon for aarch64
Fix build without git clone in cloned directory
Fix some issues which were reported by Coverity Scan
Update ScrollView.java
Fix some code comments
Optimize function ImageFind::FindImages
Rename BibTex file to please GitHub
Fix Broken URLs in citations.bib
initDSProfile: correct std::vector usage
Fix typo in stepblob.h
Fix regression in layout detection since 5.0.0
Fix loading of sublangs (regression)
   2023-07-18 20:19:24 by Nia Alarie | Files touched by this commit (17)
Log message:
graphics: Adapt packages to USE_(CC|CXX)_FEATURES
   2023-07-17 21:33:04 by Adam Ciarcinski | Files touched by this commit (2) | Package updated
Log message:
tesseract: updated to 5.3.2

5.3.2

Fix snap package building
Support for Sgaw and W Pwo Karen languages in the Myanmar validator.
Replace bool array by more compact vector
Replace deprecated sprintf
Improve format of logging from lstmtraining
Clean code
Abort with error message if OSD is requested with LSTM-only model
Fix typos