Subject: CVS commit: pkgsrc/textproc/miller
From: Thomas Klausner
Date: 2016-09-01 18:25:51
Message id: 20160901162551.B3D8BFBC3@cvs.NetBSD.org

Log Message:
Updated miller to 4.5.0.

4.5.0

Customizable output format for redirected output

In a natural follow-on to the 4.4.0 redirected-output feature, the
4.5.0 release allows your tap-files to be in a different output
format from the main program output.

For example, using

mlr --icsv --opprint ... then put --ojson \
  'tee > "mytap-".$a.".dat", $*' then ...

the input is CSV, the output is pretty-print tabular, but the
tee-files output is written in JSON format. Likewise --ofs, --ors,
--ops, --jvstack, and all other output-formatting options from the
main help at mlr -h and/or man mlr default to the main command-line
options, and may be overridden with flags supplied to mlr put and
mlr tee.
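
The behavior described above can be modeled with a small Python
sketch. This is not Miller's implementation: the function names
tee_by_field and format_pprint are hypothetical, the tap files are
written as one JSON object per line (Miller's own JSON layout may
differ), and the pretty-printer is a simplified stand-in for --opprint.

```python
import csv
import io
import json
import os
import tempfile


def tee_by_field(csv_text, field, out_dir):
    """Write each record as JSON to mytap-<value>.dat; return all records.

    Hypothetical helper: models 'tee > "mytap-".$a.".dat", $*' with a
    JSON tap format that is independent of the main output format.
    """
    records = list(csv.DictReader(io.StringIO(csv_text)))
    handles = {}
    try:
        for rec in records:
            path = os.path.join(out_dir, "mytap-" + rec[field] + ".dat")
            if path not in handles:
                handles[path] = open(path, "w")
            handles[path].write(json.dumps(rec) + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return records


def format_pprint(records):
    """Render records as a left-aligned table (simplified stand-in)."""
    if not records:
        return ""
    keys = list(records[0])
    widths = {k: max(len(k), *(len(r[k]) for r in records)) for k in keys}
    lines = [" ".join(k.ljust(widths[k]) for k in keys)]
    for rec in records:
        lines.append(" ".join(rec[k].ljust(widths[k]) for k in keys))
    return "\n".join(line.rstrip() for line in lines)


if __name__ == "__main__":
    sample = "a,x\npan,1\neks,2\npan,3\n"
    with tempfile.TemporaryDirectory() as tap_dir:
        main_records = tee_by_field(sample, "a", tap_dir)
        print(format_pprint(main_records))   # main output: tabular
        print(sorted(os.listdir(tap_dir)))   # tap files: JSON, one per key
```

The point of the sketch is that the tap writer and the main writer
each pick their own format, which is what passing --ojson to put
achieves in the example command.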

4.4.0

Redirected output, row-value shift, and other features

The principal feature of Miller 4.4.0 is redirected output. Inspired
by awk, Miller lets you tap/tee your data as it's processed, run
output through subordinate processes such as gzip and jq, split a
single file into multiple files per an account-ID column, and so
on.

Details:
http://johnkerl.org/miller/doc/reference.html#Redirected-output_statements_for_put

Other features:

    mlr step -a shift allows you to place the previous record's
    values alongside the current record's values:
    http://johnkerl.org/miller/doc/reference.html#step

    mlr head, when used without the group-by flag (-g), stops after
    the specified number of records has been output. For example,
    even with a multi-gigabyte data file, mlr head -n 10 hugefile.dat
    will complete quickly after producing the first ten records
    from the file.

    The sec2gmtdate verb and the sec2gmtdate function for filter/put
    are new: please see
    http://johnkerl.org/miller/doc/reference.html#sec2gmtdate and
    http://johnkerl.org/miller/doc/reference.html#Functions_for_filter_and_put.

    sec2gmt and sec2gmtdate both leave non-numbers as-is, rather
    than formatting them as (error). This is particularly relevant
    for formatting nullable epoch-seconds columns in SQL-table
    output: if a column value is NULL then after sec2gmt or
    sec2gmtdate it will still be NULL.

    The dot operator has been universalized to work with any data
    type and produce a string. For example, if the field n has
    integers, then instead of typing mlr put '$name = "value:".string($n)'
    you can now simply do mlr put '$name = "value:".$n'. This is
    particularly timely for creating filenames for redirected
    print/dump/tee/emit output.

    The online documents now have a copy of the Miller manpage:
    http://johnkerl.org/miller/doc/manpage.html

    Bugfix: inside filter/put, $x=="" gave a different result from
    isempty($x). This was nonsensical; the two are now equivalent.
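
Two of the items above, the shift stepper and the non-number
pass-through in sec2gmt/sec2gmtdate, can be illustrated with a short
Python sketch. It models the described behavior only and is not
Miller's code. Assumptions: the output field is named <field>_shift,
the first record (which has no predecessor) gets the placeholder "-",
and only integer epoch seconds are treated as numbers.

```python
from datetime import datetime, timezone


def step_shift(records, field):
    """Add <field>_shift holding the previous record's value of field."""
    previous = "-"  # assumption: placeholder when there is no previous record
    shifted = []
    for rec in records:
        new_rec = dict(rec)
        new_rec[field + "_shift"] = previous
        previous = rec[field]
        shifted.append(new_rec)
    return shifted


def _to_utc(value):
    """Return a UTC datetime for integer epoch seconds, else None."""
    try:
        seconds = int(value)
    except (TypeError, ValueError):
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def sec2gmt(value):
    """Format epoch seconds as a UTC timestamp; leave non-numbers as-is."""
    moment = _to_utc(value)
    return value if moment is None else moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def sec2gmtdate(value):
    """Format epoch seconds as a UTC date; leave non-numbers as-is."""
    moment = _to_utc(value)
    return value if moment is None else moment.strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(step_shift([{"x": "1"}, {"x": "5"}, {"x": "9"}], "x"))
    for raw in ["0", "1500000000", "NULL", ""]:
        print(repr(raw), "->", repr(sec2gmt(raw)), repr(sec2gmtdate(raw)))
```

Returning the input unchanged for values such as NULL is what keeps
nullable epoch-seconds columns intact after conversion.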

Files:
Revision  Action  File
1.8       modify  pkgsrc/textproc/miller/Makefile
1.9       modify  pkgsrc/textproc/miller/distinfo