./lang/nawk, Brian Kernighan's pattern-directed scanning and processing language

Branch: CURRENT, Version: 20230909, Package name: nawk-20230909, Maintainer: pkgsrc-users

The one, true implementation of the AWK pattern-directed scanning and
processing language, by one of the language's creators, Brian Kernighan.
This is the version of awk described in "The AWK Programming Language",
by Al Aho, Brian Kernighan, and Peter Weinberger (Addison-Wesley,
1988, ISBN 0-201-07981-X). It is also known as new awk, or nawk.

Required to build:

CVS history:

   2023-09-17 12:32:06 by Paolo Vincenzo Olivo | Files touched by this commit (13)
Log message:
lang/nawk: downgrade to 20230909.

Partially revert previous commit, by downgrading the package to the
most recent release tag supporting ASCII encoded input files (and
processing strings as sequences of bytes).
This is needed by security/mozilla-rootcerts and likely other packages;
see https://mail-index.netbsd.org/tech-pkg/2023/09/17/msg028190.html.

This version incorporates all the changes described in the FIXES file up
to 2023-09-09, minus support for UTF-8 and comma-separated values (CSV)
   2023-09-12 21:16:52 by Paolo Vincenzo Olivo | Files touched by this commit (18) | Package updated
Log message:
lang/nawk: update to release 20230911

This release marks the official 2nd edition of the AWK programming

# CHANGES (since 20220122)

Sep 11, 2023:
	Added --csv option to enable processing of comma-separated
	values inputs.  When --csv is enabled, fields are separated
	by commas, fields may be quoted with " double quotes, fields
	may contain embedded newlines.

	If no explicit separator argument is provided, split() uses
	the setting of --csv to determine how fields are split.
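
The quoting rule described above can be illustrated with a small C sketch. It is not the code awk uses; the function name `csv_count_fields` is hypothetical, and treating a doubled quote inside a quoted field as a literal quote is an assumption taken from common CSV practice rather than from the text above.

```c
#include <stddef.h>

/* Count the fields in one CSV record (hypothetical helper, illustration only).
 * A comma separates fields unless it appears between double quotes.
 * Newlines inside quotes are ordinary characters, so a quoted field
 * may span lines. An empty record is reported as zero fields. */
size_t csv_count_fields(const char *rec)
{
	size_t n = 1;
	int in_quotes = 0;

	if (*rec == '\0')
		return 0;
	for (; *rec != '\0'; rec++) {
		if (*rec == '"')
			in_quotes = !in_quotes;	/* a doubled "" toggles twice, so state is unchanged */
		else if (*rec == ',' && !in_quotes)
			n++;
	}
	return n;
}
```

With this sketch, the record `a,"b,c",d` has three fields, because the comma between `b` and `c` is inside quotes.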

	Strings may now contain UTF-8 code points (not necessarily
	characters).  Functions that operate on characters, like
	length, substr, index, match, etc., use UTF-8, so the length
	of a string of 3 emojis is 3, not 12 as it would be if bytes
	were counted.
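
The difference between counting bytes and counting code points can be sketched in C. This is an illustration only, not awk's implementation; the name `utf8_codepoints` is hypothetical and the sketch assumes the input is valid UTF-8.

```c
#include <stddef.h>

/* Count UTF-8 code points in a NUL-terminated string.
 * Hypothetical helper; assumes valid UTF-8 input.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte
 * that is not a continuation byte starts a new code point. */
size_t utf8_codepoints(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++) {
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	}
	return n;
}
```

An emoji such as U+1F600 occupies four bytes in UTF-8, so three of them give a `strlen()` of 12 while the function above reports 3.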

	Regular expressions are processed as UTF-8.

	Unicode literals can be written as \u followed by one
	to eight hexadecimal digits.  These may appear in strings and
	regular expressions.
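
Turning a numeric code point, such as the value following \u, into UTF-8 bytes can be sketched as follows. The function name `utf8_encode` is hypothetical, and rejecting surrogates and values above U+10FFFF is an assumption of this sketch; the text above does not say how awk treats such values.

```c
#include <stddef.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is a surrogate or
 * above U+10FFFF (rejecting those is an assumption of this sketch). */
size_t utf8_encode(unsigned long cp, unsigned char *out)
{
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* UTF-16 surrogate range */
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```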

Sep 06, 2023:
	Fix edge case where FS is changed on commandline. Thanks to
	Gordon Shephard and Miguel Pineiro Jr.

	Fix regular expression clobbering in the lexer, where the lexer
	did not make a copy of regexp literals. Also, makedfa memory leaks
	have been plugged. Thanks to Miguel Pineiro Jr.

Dec 15, 2022:
	Force hex escapes in strings to be no more than two characters,
	as they already are in regular expressions. This brings internal
	consistency, as well as consistency with gawk. Thanks to
	Arnold Robbins.
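
The two-character limit can be illustrated with a short C sketch. It is not the lexer's actual code; the function name `hex2` is hypothetical. It consumes at most two hexadecimal digits and leaves anything after them to be read as ordinary text.

```c
#include <ctype.h>

/* Parse at most two hex digits starting at *sp (the text after "\x").
 * Advances *sp past the digits consumed and returns their value.
 * Hypothetical helper for illustration only. */
int hex2(const char **sp)
{
	const char *s = *sp;
	int val = 0;
	int i;

	for (i = 0; i < 2 && isxdigit((unsigned char)s[i]); i++) {
		int c = (unsigned char)s[i];

		val = val * 16 + (isdigit(c) ? c - '0' : tolower(c) - 'a' + 10);
	}
	*sp = s + i;
	return val;
}
```

Under this rule the escape `\x41BC` yields the byte 0x41 followed by the literal characters `BC`.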

Sep 12, 2022:
	adjbuf minlen error (cannot be 0) in cat, resulting in NULL pbuf.
	discovered by todd miller. also use-after-free issue with
	tempfree in cat, thanks to Miguel Pineiro Jr and valgrind.

Aug 30, 2022:
	Various leaks and use-after-free issues plugged/fixed.
	Thanks to Miguel Pineiro Jr. <mpj@pineiro.cc>.

May 23, 2022:
	Memory leak when assigning a string to some of the built-in
	variables. allocated string erroneously marked DONTFREE.
	Thanks to Miguel Pineiro Jr. <mpj@pineiro.cc>.

Mar 14, 2022:
	Historic bug: command-line "name=value" assignment had been
	truncating its entry in ARGV. (circa 1989) Thanks to
	Miguel Pineiro Jr. <mpj@pineiro.cc>.

Mar 3, 2022:
	Fixed file management memory leak that appears to have been
	there since the files array was first initialized with stdin,
	stdout, and stderr (circa 1992). Thanks to Miguel Pineiro Jr.
   2023-02-24 21:57:50 by Paolo Vincenzo Olivo | Files touched by this commit (24) | Package updated
Log message:
lang/nawk: update to release 20220122.


2020-07-30         Arnold D. Robbins     <arnold@skeeve.com>

	By fiat, we use bison for $(YACC). Trying to accommodate
	different versions didn't work.

	* makefile: Significant cleanup. Replace all ytab* references
	with awkgram.tab.* and simplify definition of YACC.
	* .gitignore: Remove ytab* references.
	* b.c, lex.c, maketab.c, parse.c, run.c: Replace include of ytab.h
	with awkgram.tab.h.
	* lib.c, main.c, tran.c: Remove include of ytab.h, wasn't needed.

2020-01-20         Arnold D. Robbins     <arnold@skeeve.com>

	* run.c (openfile): Set the close-on-exec flag for file
	and pipe redirections that aren't stdin/stdout/stderr.
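
Setting the close-on-exec flag is typically done with fcntl(2). The sketch below shows the general technique and is not the code in run.c; the function name `set_cloexec` is hypothetical, and skipping descriptors 0 to 2 is this sketch's way of expressing the stdin/stdout/stderr exception.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>

/* Mark the descriptor behind fp close-on-exec so that child processes
 * started later do not inherit it. Descriptors 0, 1 and 2 are left alone.
 * Returns 0 on success, -1 on error. Hypothetical helper. */
int set_cloexec(FILE *fp)
{
	int fd = fileno(fp);
	int flags;

	if (fd < 0)
		return -1;
	if (fd <= 2)
		return 0;
	flags = fcntl(fd, F_GETFD);
	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```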

2020-01-06         Arnold D. Robbins     <arnold@skeeve.com>

	Minor fixes.
	* b.c (replace_repeat): Turn init_q back into an int.
	* lex.c (string): Use \a instead of \007.
	* tran.c (catstr): Use snprintf instead of sprintf.

2020-01-01         Arnold D. Robbins     <arnold@skeeve.com>

	* tran.c (syminit, arginit, envinit): Free sval member before
	setting it. Thanks to valgrind.
	* b.c: Small formatting cleanups in several routines.

2019-12-27         Arnold D. Robbins     <arnold@skeeve.com>

	* b.c (replace_repeat): Fix a bug whereby a{0,3} could match
	four a's.  Thanks to Anonymous AWK fan <awkfan77@mailfence.com>
	for the report. Also, minor code formatting cleanups.
	* testdir/T.int-expr: New file.
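
The expected behaviour of a bounded repetition can be checked against the system's POSIX regex library, as in the sketch below. This uses regcomp(3) and regexec(3), not awk's own matcher in b.c, and the function name `ere_match` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if s matches the POSIX extended regular expression pattern,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * Hypothetical helper built on the system regex library. */
int ere_match(const char *pattern, const char *s)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;
	rc = regexec(&re, s, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

Anchored as `^a{0,3}$`, the pattern should accept zero to three a's and reject four.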

2019-12-11         Arnold D. Robbins     <arnold@skeeve.com>

	* README: Renamed to ...
	* README.md: ... this. Cleaned up some as well,
	including moving to Markdown.

2019-11-08         Arnold D. Robbins     <arnold@skeeve.com>

	* test/T.chem: Use $oldawk instead of hardwiring 'awk'.
	* test/T.lilly: Remove gawk warnings from output, improves

2019-10-07         Arnold D. Robbins     <arnold@skeeve.com>

	* b.c (fnematch): Change type of pbuf from unsigned char to char.
	* proto.h (fnematch): Ditto.

2019-10-06         Arnold D. Robbins     <arnold@skeeve.com>

	* lib.c (readrec): Allow RS a regular expression. Imported
	the code from the NetBSD awk.
	* b.c (fnematch): New function for implementing the feature.
	* awk.1: Updated.
	* main.c (version): Updated.

2019-06-24         Arnold D. Robbins     <arnold@skeeve.com>

	* makefile: Revise to take into account there is no more awktest.tar,
	add targets 'check' and 'test', and also 'testclean' to clean up
	after test run.  Have 'clean' and 'cleaner' depend upon 'testclean'.

2019-06-23         Arnold D. Robbins     <arnold@skeeve.com>

	* testdir: Extracted from awktest.tar and added to Git.
	* awktest.tar: Removed.

2019-06-06         Arnold D. Robbins     <arnold@skeeve.com>

	* awk.1: Fix a typo, minor edits.

2019-06-05         Arnold D. Robbins     <arnold@skeeve.com>

	* b.c (relex): Count parentheses and treat unmatched right paren
	as a literal character.
	* awktest.tar (testdir/T.re): Added a test case.
	* main.c (version): Updated.

2019-05-29         Arnold D. Robbins     <arnold@skeeve.com>

	* lib.c (isclvar): Remove check for additional '=' after
	first one. No longer needed.

2019-01-26         Arnold D. Robbins     <arnold@skeeve.com>

	* main.c (version): Updated.

2019-01-25         Arnold D. Robbins     <arnold@skeeve.com>

	* run.c (awkgetline): Check for numeric value in all getline
	variants. See the numeric-getline.* files in bugs-fixed directory.

2018-08-29         Arnold D. Robbins     <arnold@skeeve.com>

	* REGRESS: Check for existence of a.out. If not there, run
	make.  Enable core dumps for T.arnold system status test
	to work on MacOS X.

2018-08-22         Arnold D. Robbins     <arnold@skeeve.com>

	* awktest.tar (testdir/T.expr): Fix test for unary plus.

2018-08-22         Arnold D. Robbins     <arnold@skeeve.com>

	* REGRESS: Extract tests if necessary, set PATH to include '.'.
	* regdir/beebe.tar (Makefile): Fix longwrds test to prefix
	sort with LC_ALL=C.
	* awktest.tar: Updated from fixed test suite, directory
	it extracts is now called 'testdir' to match what's in top-level
	REGRESS script.
	* regdir: Removed, as Brian wants to keep the test suite in
	the tar file.

2018-08-22         Arnold D. Robbins     <arnold@skeeve.com>

	* FIXES, lib.c, run.c, makefile, main.c: Merge from Brian's tree.
	* REGRESS: New file, from Brian.
	* awktest.tar: Restored from Brian's tree.

2018-08-22         Arnold D. Robbins     <arnold@skeeve.com>

	* awkgram.y (UPLUS): New token. In the grammar, call op1()
	with it.
	* maketab.c (proc): Add entry for UPLUS.
	* run.c (arith): Handle UPLUS.
	* main.c (version): Updated.
	* bugs-fixed/unary-plus.awk, bugs-fixed/unary-plus.bad,
	bugs-fixed/unary-plus.ok: New files.

2018-08-10         Arnold D. Robbins     <arnold@skeeve.com>

	* TODO: Updated.
	* awk.1: Improve use of macros, add some additional explanation
	in a few places, alphabetize list of variables.

2018-08-08         Arnold D. Robbins     <arnold@skeeve.com>

	* awk.h (Cell): Add new field `fmt' to track xFMT value used
	for a string conversion.
	[CONVC, CONVO]: New flag macros.
	* bugs-fixed/README: Updated.
	* bugs-fixed/string-conv.awk, bugs-fixed/string-conv.bad,
	bugs-fixed/string-conv.ok: New files.
	* main.c (version): Updated.
	* proto.h (flags2str): Add declaration.
	* tran.c (setfval): Clear CONVC and CONVO flags and set vp->fmt
	to NULL.
	(setsval): Ditto. Add large comment and new code to manage
	correct conversion of number to string based on various flags
	and the value of vp->fmt. The idea is to not convert again
	if xFMT is the same as before and we're doing the same conversion.
	Otherwise, clear the old flags, set the new, and reconvert.
	(flags2str): New function. For debug prints and for use from a debugger.

2018-08-05         Arnold D. Robbins     <arnold@skeeve.com>

	Fix filename conflicts in regdir where the only difference was
	in letter case. This caused problems on Windows systems.

	* regdir/Compare.T1: Renamed from regdir/Compare.T.
	* regdir/t.delete0: Renamed from regdir/t.delete.
	* regdir/t.getline1: Renamed from regdir/t.getline.
	* regdir/t.redir1: Renamed from regdir/t.redir.
	* regdir/t.split1: Renamed from regdir/t.split.
	* regdir/t.sub0: Renamed from regdir/t.sub.
	* regdir/REGRESS: Adjusted.

2018-08-04         Arnold D. Robbins     <arnold@skeeve.com>

	With scalpel, tweezers, magnifying glass and bated breath,
	borrow code from the NetBSD version of nawk to fix the years-old
	bug whereby decrementing the value of NF did not change the

	* lib.c (fldbld): Set donerec to 1 when done.
	(setlastfld): New function.
	* proto.h (setlastfld): Add declaration.
	* run.c (copycell): Make code smarter about flags (from NetBSD code).
	* tran.c (setfree): New function.
	* tran.c (setfval): Normalize negative zero to positive zero.
	If setting NF, clear donerec and call setlastfld().
	(setsval): Remove call to save_old_OFS().  If setting OFS, call
	recbld(). If setting NF, clear donerec and call setlastfld().

	As part of the process, revert OFS-related changes of 2018-05-22:

	* awk.h (saveOFS, saveOFSlen, save_old_OFS): Remove declarations.
	* lib.c (recbld): Use *OFS instead of saveOFS.
	* run.c (saveOFS, saveOFSlen, save_old_OFS): Remove.
	* tran.c (syminit): Remove initialization of saveOFS and saveOFSlen.

	General stuff that goes along with all this:

	* bugs-fixed/README: Updated.
	* bugs-fixed/decr-NF.awk, bugs-fixed/decr-NF.bad,
	bugs-fixed/decr-NF.ok: New files.
	* main.c (version): Updated.
	* regdir/README.TESTS: Fix awk book title.
	* regdir/T.misc: Revise test to match fixed code.
	* run.c (format): Increase size of buffer used for %a test. (Unrelated
	to NF or OFS, but fixes a compiler complaint.)

2018-06-07         Arnold D. Robbins     <arnold@skeeve.com>

	* regdir/beebe.tar: Fix longwrds.ok so that the test will pass.
	The file was incorrectly sorted.

2018-06-06         Arnold D. Robbins     <arnold@skeeve.com>

	* regdir/T.lilly: Fix the bug again in the second instance
	of the code. Thanks to BWK for pointing this out.

2018-05-31         Arnold D. Robbins     <arnold@skeeve.com>

	* regdir/T.lilly: Fix a syntax error and ordering bug
	in creating the 'foo' file.

2018-05-23         Arnold D. Robbins     <arnold@skeeve.com>

	* awk.1: Remove standalone 'awk' at the top of file, it messed up
	the formatting. Arrange built-in variable list in alphabetical

2018-05-23         Arnold D. Robbins     <arnold@skeeve.com>

	* main.c (version): Add my email address and a date so that
	users can tell this isn't straight BWK awk.
	* README.md: Minor updates.
	* TODO: Updated.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	Add POSIX-required formats %a and %A.

	* run.c (format): Check for %a support in C library. If there,
	allow %a and %A as valid formats.
	* TODO: Updated.
	* bugs-fixed/README: Updated.
	* bugs-fixed/a-format.awk, bugs-fixed/a-format.bad,
	bugs-fixed/a-format.ok: New files.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* FIXES: Restored a line from a much earlier version that
	apparently got lost when the dates were reordered.
	* TODO: Updated.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* README.md: New file.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* regdir/echo.c, regdir/time.c: Minor fixes to compile without
	warning on current GCC / Linux.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* TODO: New file.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* makefile (gitadd, gitpush): Remove these targets. They
	should not be automated and were incorrect for things that
	would be done regularly.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	Fix nawk so that [[:blank:]] only matches space and tab instead
	of any whitespace character, originally made May 10, 2018.
	See bugs-fixed/space.awk.

	This appears to have been a thinko on Brian's part.

	* b.c (charclasses): Use xisblank() function for [[:blank:]].
	* bugs-fixed/README: Updated.
	* bugs-fixed/space.awk, bugs-fixed/space.bad,
	bugs-fixed/space.ok: New files.
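
The distinction between the two character classes mirrors the C library's isblank() and isspace(). The sketch below is an illustration in the default "C" locale, not the code in b.c; the function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Count characters for which isblank() is true (space and tab in the
 * "C" locale). Hypothetical helper for illustration. */
size_t count_blank(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (isblank((unsigned char)*s))
			n++;
	return n;
}

/* Count characters for which isspace() is true (space, tab, newline,
 * vertical tab, form feed and carriage return in the "C" locale). */
size_t count_space(const char *s)
{
	size_t n = 0;

	for (; *s != '\0'; s++)
		if (isspace((unsigned char)*s))
			n++;
	return n;
}
```

For the string `" \t\n\v\f\r"`, `count_blank` reports 2 and `count_space` reports 6.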

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* .gitignore: New file.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	Fix nawk to provide reasonable exit status for system(),
	a la gawk, originally made March 12, 2016. See

	* run.c (bltin): For FSYSTEM, use the macros defined for wait(2)
	to produce a reasonable exit value, instead of doing a floating-point
	division by 256.
	* awk.1: Document the return status values.
	* bugs-fixed/README: Updated.
	* bugs-fixed/system-status.awk, bugs-fixed/system-status.bad,
	bugs-fixed/system-status.ok: New files.
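
Decoding a status with the wait(2) macros can be sketched as follows. This is not the code in run.c; the function name `decode_status` is hypothetical, and reporting death by signal as 256 plus the signal number is an assumption modelled on gawk's documented behaviour, which the text above does not confirm for nawk.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Turn the value returned by system(3) into a small integer.
 * Normal exit: the exit code. Killed by a signal: 256 + signal number
 * (an assumption borrowed from gawk's documentation). Anything else,
 * including -1 for a failed system() call, is passed through unchanged.
 * Hypothetical helper. */
int decode_status(int status)
{
	if (status == -1)
		return -1;
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	if (WIFSIGNALED(status))
		return 256 + WTERMSIG(status);
	return status;
}
```

On typical systems the raw status for `exit 3` is 768, which is why the older code divided by 256; the macros give 3 directly and also distinguish signals.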

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	Bug fix with respect to rebuilding a record, originally
	made August 19, 2014. See bugs-fixed/ofs-rebuild.awk.

	* awk.h (saveOFS, saveOFSlen): Declare new variables.
	* lib.c (recbld): Use them when rebuilding the record.
	* run.c (saveOFS, saveOFSlen): Define new variables.
	(save_old_OFS): New function to save OFS aside.
	* tran.c (syminit): Initialize saveOFS and saveOFSlen.
	(setsval): If setting a field, call save_old_OFS().
	* bugs-fixed/README, bugs-fixed/ofs-rebuild.awk,
	bugs-fixed/ofs-rebuild.bad, bugs-fixed/ofs-rebuild.ok: New files.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* makefile (YACC): Use bison.

2018-05-22         Arnold D. Robbins     <arnold@skeeve.com>

	* ChangeLog: Created.
	* regdir: Created. Based on contents of awktest.a.
	* .gitattributes: Created, to preserve CR LF in regdir/t.crlf.
	* awktest.a: Removed.
	* regdir/T.gawk, regdir/T.latin1: Updated from awktest.tar.
	* awktest.tar: Removed.
   2020-01-26 18:32:28 by Roland Illig | Files touched by this commit (981)
Log message:
all: migrate homepages from http to https

pkglint -r --network --only "migrate"

As a side-effect of migrating the homepages, pkglint also fixed a few
indentations in unrelated lines. These and the new homepages have been
checked manually.
   2015-12-17 22:27:53 by David A. Holland | Files touched by this commit (1)
Log message:
Don't use __attribute__((__noreturn__)) without compiler guards.
should fix (or at least improve) bootstrap on DU/Tru64 with the
DEC/Compaq compiler.
   2014-10-09 16:07:17 by Thomas Klausner | Files touched by this commit (1163)
Log message:
Remove pkgviews: don't set PKG_INSTALLATION_TYPES in Makefiles.
   2014-03-25 13:50:49 by Jonathan Perkin | Files touched by this commit (1)
Log message:
Increase the default YYMAXDEPTH from 150 to 300, fixes problems building
devel/editline where mdoc2man.awk would previously abort with a stack
overflow.  This is still pretty conservative compared to other parsers.

   2014-03-12 15:20:43 by Ryo ONODERA | Files touched by this commit (18)
Log message:
Update to 20121220

* Works fine under Debian GNU/Linux 7.4, NetBSD/amd64 6.99.36
* Merge pkgsrc specific changes

Dec 20, 2012:
	fiddled makefile to get correct yacc and bison flags.  pick yacc
	(linux) or bison (mac) as necessary.

	added  __attribute__((__noreturn__)) to a couple of lines in
	proto.h, to silence someone's enthusiastic checker.

	fixed obscure call by value bug in split(a[1],a) reported on
	9fans.  the management of temporary values is just a mess; i
	took a shortcut by making an extra string copy.  thanks
	to paul patience and arnold robbins for passing it on and for
	proposed patches.

	tiny fiddle in setfval to eliminate -0 results in T.expr, which
	has irritated me for 20+ years.
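
Eliminating a negative zero takes only a comparison, because -0.0 compares equal to 0.0 in IEEE 754 arithmetic. The sketch below shows the idea; it is not the actual change to setfval, and the function name `normalize_zero` is hypothetical.

```c
#include <math.h>

/* Replace negative zero with positive zero; leave other values alone.
 * -0.0 == 0.0 is true, so the test catches both zeros, and returning
 * the literal 0.0 drops the sign bit. Hypothetical helper. */
double normalize_zero(double f)
{
	if (f == 0.0)
		return 0.0;
	return f;
}
```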

Aug 10, 2011:
	another fix to avoid core dump with delete(ARGV); again, many thanks
	to ruslan ermilov.

Aug 7, 2011:
	split(s, a, //) now behaves the same as split(s, a, "")

Jun 12, 2011:
	/pat/, \n /pat/ {...} is now legal, though bad style to use.

	added checks to new -v code that permits -vnospace; thanks to
	ruslan ermilov for spotting this and providing the patch.

	removed fixed limit on number of open files; thanks to aleksey
	cheusov and christos zoulas.

	fixed day 1 bug that resurrected deleted elements of ARGV when
	used as filenames (in lib.c).

	minor type fiddles to make gcc -Wall -pedantic happier (but not
	totally so); turned on -fno-strict-aliasing in makefile.

May 6, 2011:
	added #ifdef for isblank.
	now allows -ffoo as well as -f foo arguments.
	(thanks, ruslan)

May 1, 2011:
	after advice from todd miller, kevin lo, ruslan ermilov,
	and arnold robbins, changed srand() to return the previous
	seed (which is 1 on the first call of srand).  the seed is
	an Awkfloat internally though converted to unsigned int to
	pass to the library srand().  thanks, everyone.
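
The behaviour described above can be sketched in C. This is an illustration rather than the code in run.c; the names `prev_seed` and `awk_like_srand` are hypothetical, and the sketch assumes the seed is non-negative and fits in an unsigned int.

```c
#include <stdlib.h>

/* Seed remembered between calls; starts at 1, matching the text above. */
static double prev_seed = 1.0;

/* Install a new seed and return the previous one.
 * The seed is kept as a double and converted to unsigned int only when
 * handed to the C library. Assumes 0 <= seed <= UINT_MAX.
 * Hypothetical helper. */
double awk_like_srand(double seed)
{
	double old = prev_seed;

	prev_seed = seed;
	srand((unsigned int)seed);
	return old;
}
```

The first call returns 1, and each later call returns whatever seed the previous call installed.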

	fixed a subtle (and i hope low-probability) overflow error
	in fldbld, by adding space for one extra \0.  thanks to
	robert bassett for spotting this one and providing a fix.

	removed the files related to compilation on windows.  i no
	longer have anything like a current windows environment, so
	i can't test any of it.

May 23, 2010:
	fixed long-standing overflow bug in run.c; many thanks to
	nelson beebe for spotting it and providing the fix.

	fixed bug that didn't parse -vd=1 properly; thanks to santiago
	vila for spotting it.

Feb 8, 2010:
	i give up.  replaced isblank with isspace in b.c; there are
	no consistent header files.

Nov 26, 2009:
	fixed a long-standing issue with when FS takes effect.  a
	change to FS is now noticed immediately for subsequent splits.

	changed the name getline() to awkgetline() to avoid yet another
	name conflict somewhere.

Feb 11, 2009:
	temporarily for now defined HAS_ISBLANK, since that seems to
	be the best way through the thicket.  isblank arrived in C99,
	but seems to be arriving at different systems at different

Oct 8, 2008:
	fixed typo in b.c that set tmpvec wrongly.  no one had ever
	run into the problem, apparently.  thanks to alistair crooks.

Oct 23, 2007:
	minor fix in lib.c: increase inputFS to 100, change malloc
	for fields to n+1.

	fixed memory fault caused by out of order test in setsval.

	thanks to david o'brien, freebsd, for both fixes.

May 1, 2007:
	fiddle in makefile to fix for BSD make; thanks to igor sobrado.

Mar 31, 2007:
	fixed some null pointer refs calling adjbuf.

Feb 21, 2007:
	fixed a bug in matching the null RE in sub and gsub.  thanks to al aho
	who actually did the fix (in b.c), and to wolfgang seeberg for finding
	it and providing a very compact test case.

	fixed quotation in b.c; thanks to Hal Pratt and the Princeton Dante

	removed some no-effect asserts in run.c.

	fiddled maketab.c to not complain about bison-generated values.

	removed the obsolete -V argument; fixed --version to print the
	version and exit.

	fixed wording and an outright error in the usage message; thanks to igor
	sobrado and jason mcintyre.

	fixed a bug in -d that caused core dump if no program followed.

Jan 1, 2007:
	dropped mac.code from makefile; there are few non-MacOSX
	mac's these days.

Jan 17, 2006:
	system() not flagged as unsafe in the unadvertised -safe option.
	found it while enhancing tests before shipping the ;login: article.
	practice what you preach.

	removed the 9-years-obsolete -mr and -mf flags.

	added -version and --version options.

	core dump on linux with BEGIN {nextfile}, now fixed.

	removed some #ifdef's in run.c and lex.c that appear to no
	longer be necessary.