./textproc/swath, Smart Word Analysis for THai



Branch: CURRENT, Version: 0.6.1nb3, Package name: swath-0.6.1nb3, Maintainer: pkgsrc-users

Swath is a general-purpose utility for Thai word segmentation, meant to work
around the lack of that capability in applications. It analyzes the given Thai
text by consulting a Thai word list for word boundaries, then outputs the same
text with the predefined word delimiters inserted.

It can read many kinds of input, including plain text and structured documents
like HTML, RTF, LaTeX and Lambda (Unicode version of LaTeX with Omega
typesetter kernel).
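
The word-list approach described above can be illustrated with a minimal
sketch. This is not Swath's actual engine: it assumes a greedy longest-match
strategy, and the word list, the segment() function and the "|" delimiter
are all hypothetical names chosen for the example.

```python
# Illustrative sketch only: greedy longest-match segmentation against a
# tiny, hypothetical word list. Swath's real engine and dictionary differ.

SAMPLE_WORDS = {"ภาษา", "ไทย", "ง่าย", "มาก"}  # hypothetical sample word list


def segment(text, words, delimiter="|"):
    """Return `text` with `delimiter` inserted at the word boundaries found.

    `words` must be a non-empty collection of dictionary words.
    """
    max_len = max(map(len, words))
    pieces = []      # segments found so far
    unknown = ""     # run of characters not covered by any dictionary word
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a word matches.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in words:
                if unknown:
                    pieces.append(unknown)
                    unknown = ""
                pieces.append(text[i:j])
                i = j
                break
        else:
            # No dictionary word starts here: keep the character as-is.
            unknown += text[i]
            i += 1
    if unknown:
        pieces.append(unknown)
    return delimiter.join(pieces)


if __name__ == "__main__":
    print(segment("ภาษาไทยง่ายมาก", SAMPLE_WORDS))
```

Greedy matching is the simplest strategy to show; it can pick a wrong boundary
when a shorter word would have allowed a better overall segmentation, which is
one reason real word breakers use more elaborate search.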


Required to run:
[devel/libdatrie]

Master sites:


CVS history:


   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
   2020-08-28 21:35:46 by Sean Cole | Files touched by this commit (3)
Log message:
Update to swath-0.6.1nb3
 - increment version in autoconf patch
   2020-08-28 18:59:43 by Sean Cole | Files touched by this commit (1)
Log message:
Update to swath-0.6.1nb2
- Add a fix for the Darwin build and for other OSes that have wcpcpy;
  the configure script does not work properly in that case. The upstream
  author is aware, and a proper fix will hopefully be included in the next
  official release.
   2020-08-13 18:22:40 by Sean Cole | Files touched by this commit (4)
Log message:
- Use the GitHub homepage URL now
- Include GitHub changes from the author to fix NetBSD's missing wcpcpy & wcpncpy
   2020-07-30 04:13:56 by Sean Cole | Files touched by this commit (3)
Log message:
doc: Updated textproc/swath to 0.6.1

0.6.1 (2018-08-20)
=====
- Updated word break dictionary.
- Fix a defect in RTF parsing, so RTF gets more complete word break positions.
- Compiler warning fixes.
- Minor code cleanups.
- Useful installation instructions in INSTALL file.
  (Thanks @pepa65 for the pull request.)

0.6.0 (2017-11-28)
=====
- Updated word break dictionary.
- Drop undocumented option '-l'.
- Revamped internal word break engine.
- Updated manpage.

0.5.5 (2016-12-25)
=====
- Updated word break dictionary.

0.5.4 (2016-07-08)
=====
- Updated word break dictionary.
- Fix segfault on extremely long input lines.
- Support longer input lines.
  (Bug report by Santi Romeyen)
- Support non-ASCII word break string.
  https://github.com/tlwg/swath/issues/1
- Some source code clean-ups.
- Add test suite.

0.5.3 (2014-09-01)
=====
- Updated word break dictionary.
- Fix premature output ending on long UTF-8 input line.
  (Bug report by Sorawee Porncharoenwase)
- Fix excessive break positions in plain text mode.
  (Bug report by Sorawee Porncharoenwase)
- Remove dead code, resulting in a slightly smaller binary.

0.5.2 (2013-12-23)
=====
- Fix infinite loops in LaTeX filter.
  (Bug report and patch by Neutron Soutmun)
- Fix off-by-one character loss in long HTML tokens.
  (Bug report and analysis by Nicolas Brouard)

0.5.1 (2013-10-30)
=====
- Correct word break code for Lambda.
- Updated word break dictionary.
- Adjust file filters to prevent potential buffer overflow.

0.5.0 (2013-02-11)
=====
- Character encoding conversion is now done on the fly, with no more
  buffering via a temporary file.
- Rewritten RTF filter. It's now tested to work with real RTF documents.
- Process characters as Unicode internally, so that characters not present
  in TIS-620 are not lost in output.
- Fix potential buffer overflow vulnerability in Mule mode.
- Updated word break dictionary.
- Significant source clean-ups.
- Switch to XZ tarball compression.

For pkgsrc, use gmake and add patch to compile wchar functions on NetBSD
   2017-09-04 20:08:31 by Thomas Klausner | Files touched by this commit (163)
Log message:
Follow some redirects.
   2017-01-19 19:52:30 by Alistair G. Crooks | Files touched by this commit (352)
Log message:
Convert all occurrences (353 by my count) of

	MASTER_SITES= 	site1 \
			site2

style continuation lines to be simple repeated

	MASTER_SITES+= site1
	MASTER_SITES+= site2

lines. As previewed on tech-pkg. With thanks to rillig for fixing pkglint
accordingly.
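
The rewrite described in this log message can be sketched as a small script.
This is only an illustration and not the tool that was used for the commit:
the function name split_continuations() is hypothetical, and the sketch
assumes the variable is not already set earlier in the file, since "+="
appends where "=" would have replaced.

```python
import re


def split_continuations(makefile_text, var="MASTER_SITES"):
    """Rewrite backslash-continued `VAR= a \\ b` blocks as `VAR+= x` lines."""
    pattern = re.compile(r"^%s\+?=\s*(.*)$" % re.escape(var))
    lines = makefile_text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        m = pattern.match(line)
        if m and line.rstrip().endswith("\\"):
            # Gather every physical line of this continued assignment.
            values = [m.group(1)]
            while values[-1].rstrip().endswith("\\") and i + 1 < len(lines):
                i += 1
                values.append(lines[i])
            for value in values:
                # Drop the trailing backslash and surrounding whitespace.
                value = value.rstrip().rstrip("\\").strip()
                if value:
                    out.append("%s+= %s" % (var, value))
        else:
            # Anything else is passed through unchanged.
            out.append(line)
        i += 1
    return "\n".join(out)


if __name__ == "__main__":
    sample = "MASTER_SITES=\tsite1 \\\n\t\t\tsite2"
    print(split_continuations(sample))
```

Single-line assignments are left alone, so only the continuation style named
in the log message is touched.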