./textproc/sgrep, Tool for searching and indexing text, SGML, XML and HTML files


Branch: CURRENT, Version: 1.94a, Package name: sgrep-1.94a, Maintainer: pkgsrc-users

sgrep (structured grep) is a tool for searching and indexing text,
SGML, XML and HTML files, and for filtering text streams using
structural criteria. The data model of sgrep is based on regions,
which are nonempty substrings of text. Regions are typically
occurrences of constant strings, SGML tags, or meaningful text
elements that can be recognized through delimiting strings or the
built-in SGML, XML and HTML parser. Regions can be arbitrarily long,
arbitrarily overlapping, and arbitrarily nested.
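
The region model described above can be illustrated with a small
Python sketch. This is not sgrep's implementation; the names Region,
occurrences, contains and overlaps are hypothetical and chosen only
for illustration. It assumes a region is a half-open character span
[start, end) over the searched text.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """A nonempty substring of the text, stored as a half-open span."""
    start: int
    end: int


def occurrences(text: str, needle: str) -> List[Region]:
    """Return every occurrence of a constant string, including overlapping ones."""
    if not needle:
        raise ValueError("regions must be nonempty")
    found = []
    pos = text.find(needle)
    while pos != -1:
        found.append(Region(pos, pos + len(needle)))
        # Advance by one character so overlapping matches are also found.
        pos = text.find(needle, pos + 1)
    return found


def contains(outer: Region, inner: Region) -> bool:
    """True if inner lies entirely within outer (nesting)."""
    return outer.start <= inner.start and inner.end <= outer.end


def overlaps(a: Region, b: Region) -> bool:
    """True if the regions share text but neither contains the other."""
    shares_text = a.start < b.end and b.start < a.end
    return shares_text and not contains(a, b) and not contains(b, a)
```

For example, occurrences("aaa", "aa") yields two regions that overlap
each other, while a region spanning a whole document contains both.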

Sgrep is a convenient tool for querying almost any kind of text
file with a well-known structure. These include programs, mail
folders, news folders, HTML, and SGML. With relatively simple
queries you can display mail messages by subject or sender, extract
titles, links, or any other regions from HTML files, extract
function prototypes from C, or make complex queries on SGML files
based on the DTD of the file.
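
As a rough illustration of the kind of query described above, the
following Python sketch extracts the text between a start delimiter
and the next end delimiter, which is enough to pull titles or link
targets out of simple HTML. It does not use sgrep or its query
language; the function name delimited and the sample HTML are
assumptions made for the example.

```python
from typing import List


def delimited(text: str, start: str, end: str) -> List[str]:
    """Return the text between each start delimiter and the next end delimiter."""
    results = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break
        inner_start = begin + len(start)
        finish = text.find(end, inner_start)
        if finish == -1:
            # Unterminated region: stop rather than guess where it ends.
            break
        results.append(text[inner_start:finish])
        pos = finish + len(end)
    return results


# Hypothetical sample input, for illustration only.
html = (
    "<html><head><title>Sgrep home</title></head>"
    '<body><a href="x.html">x</a> <a href="y.html">y</a></body></html>'
)
titles = delimited(html, "<title>", "</title>")
links = delimited(html, 'href="', '"')
```

A plain delimiter scan like this ignores nesting and malformed markup,
which is the kind of case where a real parser is the better choice.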


Required to build:
[pkgtools/cwrappers]

Master sites:

Filesize: 188.737 KB


CVS history:


   2021-10-26 13:23:42 by Nia Alarie | Files touched by this commit (1161)
Log message:
textproc: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Unfetchable distfiles (fetched conditionally?):
./textproc/convertlit/distinfo clit18src.zip
   2021-10-07 17:02:49 by Nia Alarie | Files touched by this commit (1162)
Log message:
textproc: Remove SHA1 hashes for distfiles
   2021-04-24 12:38:37 by Thomas Klausner | Files touched by this commit (1)
Log message:
sgrep: remove dead download link
   2020-01-26 18:32:28 by Roland Illig | Files touched by this commit (981)
Log message:
all: migrate homepages from http to https

pkglint -r --network --only "migrate"

As a side-effect of migrating the homepages, pkglint also fixed a few
indentations in unrelated lines. These and the new homepages have been
checked manually.
   2019-11-04 22:43:49 by Roland Illig | Files touched by this commit (155)
Log message:
textproc: align variable assignments

pkglint -Wall -F --only aligned --only indent -r

No manual corrections.
   2015-11-04 03:00:17 by Alistair G. Crooks | Files touched by this commit (797)
Log message:
Add SHA512 digests for distfiles for textproc category

Problems found locating distfiles:
	Package cabocha: missing distfile cabocha-0.68.tar.bz2
	Package convertlit: missing distfile clit18src.zip
	Package php-enchant: missing distfile php-enchant/enchant-1.1.0.tgz

Otherwise, existing SHA1 digests verified and found to be the same on
the machine holding the existing distfiles (morden).  All existing
SHA1 digests retained for now as an audit trail.
   2012-10-25 08:57:09 by Aleksej Saushev | Files touched by this commit (587)
Log message:
Drop superfluous PKG_DESTDIR_SUPPORT, "user-destdir" is default these days.
   2010-04-14 22:19:28 by Thomas Klausner | Files touched by this commit (4) | Imported package
Log message:
Initial import of sgrep (structured grep). The log message repeats the package description given above.