./textproc/zet, CLI utility to find the union, intersection, etc. of files



Branch: CURRENT, Version: 1.0.0, Package name: zet-1.0.0, Maintainer: pkgsrc-users

This is a command-line utility for doing set operations on files considered as
sets of lines. For instance, `zet union x y z` outputs the lines that occur in
any of `x`, `y`, or `z`.
Two notes:
- Each output line occurs only once, because we're treating the files as sets
  and the lines as their elements.
- We do take the file structure into account in one respect: the lines are
  output in the same order as they are encountered. So `zet union x` prints
  out the lines of `x`, in order, with duplicates removed.
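
The two notes above can be illustrated with a minimal Python sketch. It is not zet's implementation (zet is written in Rust); the function name `union` and the in-memory lists standing in for files are assumptions made for illustration only.

```python
def union(*files):
    """Return the lines that occur in any input, each once, in first-seen order."""
    seen = set()      # lines already emitted
    result = []
    for lines in files:
        for line in lines:
            if line not in seen:
                seen.add(line)
                result.append(line)
    return result


# Hypothetical file contents, one list entry per line.
x = ["b", "a", "b", "c"]
y = ["c", "d", "a"]

print(union(x))       # ['b', 'a', 'c']
print(union(x, y))    # ['b', 'a', 'c', 'd']
```

With a single input the sketch behaves like the `zet union x` case described above: the lines of `x` in order, with duplicates removed.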


Required to build:
[lang/rust]


Filesize: 39.844 KB



CVS history:


   2023-04-19 22:45:41 by pin | Files touched by this commit (3) | Package updated
Log message:
textproc/zet: update to 1.0.0

1.0.0 - 2023-04-18
Added
  - Add the --count-lines flag to show the number of times each line occurs in
    the input and the --count-files flag to show the number of files each line
    occurs in. The --count flag acts like --count-lines unless --count-files is
    active, in which case it acts like --count-files. The --count-none flag
    turns off counting, and can be used to override the other count flags.
    (In the usual POSIX convention, the last count flag given will override any
    previous count flag.)
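
The difference between the two kinds of count can be sketched in Python as follows. This only illustrates the distinction described above; the function names `count_lines` and `count_files` are hypothetical, and the sketch does not model zet's output format or its flag parsing.

```python
def count_lines(*files):
    """Number of times each line occurs across all inputs."""
    counts = {}
    for lines in files:
        for line in lines:
            counts[line] = counts.get(line, 0) + 1
    return counts


def count_files(*files):
    """Number of inputs in which each line occurs at least once."""
    counts = {}
    for lines in files:
        # dict.fromkeys keeps one entry per distinct line, in first-seen order
        for line in dict.fromkeys(lines):
            counts[line] = counts.get(line, 0) + 1
    return counts


# Hypothetical file contents, one list entry per line.
x = ["a", "a", "b"]
y = ["a", "c"]

print(count_lines(x, y))   # {'a': 3, 'b': 1, 'c': 1}
print(count_files(x, y))   # {'a': 2, 'b': 1, 'c': 1}
```

The line `a` occurs three times in total but in only two files, which is the case where the two counts differ.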

Changed
  - Breaking: When - is used as a file argument, zet reads from standard input,
    not the file - in the current directory. (That file can be passed to zet
    as ./-)
  - When no file arguments are given, zet reads from standard input.
  - Breaking: Add the --files (alias --file) flag for the zet single and zet
    multiple commands. The zet single command now outputs lines that occur
    exactly once in the entire input. The zet single --file command reproduces
    the old behavior (output lines that occur in just one file, though possibly
    many times in that one file). Similarly, zet multiple --files reproduces
    the old behavior of requiring output lines to occur in more than one file,
    while zet multiple without the --files flag will output lines that occur
    more than once, even if in just one file.
  - Use clap 4's help format, but clap 3's colors. This is a self-indulgent
    recreation of (part of) clap's help feature, because I like clap 4's
    help format but really miss the colored (rather than gray-scale) help.
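
The changed meaning of `zet single` and `zet multiple`, with and without `--files`, can be sketched in Python. This is an illustration of the behavior described in the changelog entry above, not zet's code: the names `tally`, `single`, `multiple`, and the `by_file` parameter are hypothetical, and emitting results in first-seen order is an assumption.

```python
def tally(*files):
    """Return (line_counts, file_counts), both keyed in first-seen order."""
    line_counts, file_counts = {}, {}
    for lines in files:
        for line in lines:
            line_counts[line] = line_counts.get(line, 0) + 1
        for line in dict.fromkeys(lines):   # each distinct line once per file
            file_counts[line] = file_counts.get(line, 0) + 1
    return line_counts, file_counts


def single(*files, by_file=False):
    """Lines occurring exactly once (by_file=True: in exactly one file)."""
    line_counts, file_counts = tally(*files)
    counts = file_counts if by_file else line_counts
    return [line for line in line_counts if counts[line] == 1]


def multiple(*files, by_file=False):
    """Lines occurring more than once (by_file=True: in more than one file)."""
    line_counts, file_counts = tally(*files)
    counts = file_counts if by_file else line_counts
    return [line for line in line_counts if counts[line] > 1]


# Hypothetical file contents, one list entry per line.
x = ["a", "a", "b"]
y = ["b", "c"]

print(single(x, y))                   # ['c']
print(single(x, y, by_file=True))     # ['a', 'c']
print(multiple(x, y))                 # ['a', 'b']
print(multiple(x, y, by_file=True))   # ['b']
```

The line `a` shows the difference: it occurs twice, but only in `x`, so it counts as "multiple" by line and as "single" by file.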
   2023-01-02 13:51:24 by pin | Files touched by this commit (3) | Package updated
Log message:
textproc/zet: update to 0.2.5

[0.2.5] - 2022-11-10
Changed
 - Bump Minimum Supported Rust Version to 1.64.0
 - Switch from failure to anyhow
 - Performance enhancements:
     - Use Cow keys for UnionSet and CountedSet so we can borrow the lines of
       the first file rather than allocating them
     - If line is in a CountedSet, don't allocate a key
     - Use FxHash — averages 10-15% faster on large files
     - Convert Diff and Union to use CowSet
     - Convert Single, Multiple, and Intersect to by-line algorithms
     - No longer create map/set for args after the 1st
 - Refactor and expand internal documentation.
 - Change Single/Multiple code to use a single NonZeroUsize operand ID rather
   than two u32 IDs
   2022-03-29 12:43:38 by pin | Files touched by this commit (1)
Log message:
textproc/zet: dead upstream
   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
   2021-08-05 10:58:31 by pin | Files touched by this commit (1)
Log message:
textproc/zet: simplify Makefile
   2021-07-05 10:45:08 by pin | Files touched by this commit (3) | Package updated
Log message:
textproc/zet: update to 0.2.0

- Add support for UTF-16 files, and make sure lines that differ only in their
  terminator (\n vs \r\n) are considered equal.

- Zet looks for Byte Order Marks in UTF-8, UTF-16LE and UTF-16BE files,
  translating UTF-16LE and UTF-16BE to UTF-8. It outputs a (UTF-8) Byte Order
  Mark if and only if it finds one in its first file argument.
- Zet strips off the line terminator (\n or \r\n) from each input line. On
  output, it uses the line terminator found in the first line of its first file
  argument (or \n if the first file consists of a single line with no
  terminator).
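
The line-terminator rule described above can be sketched in Python. This is a simplified illustration, not zet's implementation: it works on byte strings that are already UTF-8, leaves out Byte Order Mark and UTF-16 handling entirely, and the names `split_lines`, `output_terminator`, and `union_bytes` are hypothetical.

```python
def split_lines(data):
    """Split bytes into lines, stripping a trailing \\n or \\r\\n from each."""
    lines = data.split(b"\n")
    if lines[-1] == b"":          # input ended with a terminator (or was empty)
        lines.pop()
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def output_terminator(first_file):
    """Terminator of the first line of the first input, or \\n if it has none."""
    idx = first_file.find(b"\n")
    if idx > 0 and first_file[idx - 1:idx] == b"\r":
        return b"\r\n"
    return b"\n"


def union_bytes(first, *rest):
    """Union of the inputs, written with the first input's terminator."""
    terminator = output_terminator(first)
    seen, out = set(), []
    for data in (first, *rest):
        for line in split_lines(data):
            if line not in seen:
                seen.add(line)
                out.append(line)
    return b"".join(line + terminator for line in out)


# Hypothetical inputs: the first uses \r\n, the second uses \n.
first = b"x\r\ny\r\n"
second = b"y\nz\n"

print(union_bytes(first, second))   # b'x\r\ny\r\nz\r\n'
```

Because terminators are stripped before comparison, `y\r\n` and `y\n` count as the same line, and the output uses `\r\n` throughout because that is what the first line of the first input uses.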
   2021-06-15 09:27:31 by pin | Files touched by this commit (5)
Log message:
textproc/zet: import package
