2021-05-24 21:56:06 by Thomas Klausner | Files touched by this commit (3575) |
Log message:
*: recursive bump for perl 5.34
|
2021-01-14 19:21:01 by Amitai Schleier | Files touched by this commit (2) |
Log message:
Update to 1.4.18. From the changelog:
indexers:
* omindex:
+ Add default MIME mapping for application/rtf. IANA have registrations for
text/rtf and (more recently) application/rtf (it seems because newer
versions of the RTF format can contain 8-bit data) so we now recognise
application/rtf by default and handle it the same way as text/rtf.
Current libmagic seems to always return text/rtf (no matches for
application/rtf in magic.mgc) and we continue to map extension rtf to
text/rtf, so this change is mainly future-proofing against future libmagic
changes.
+ Add support for indexing OpenXPS, which is effectively the same as XPS
internally in ways we care about, but it uses a different mimetype and a
different filename extension.
omega:
* Explicitly use OR for MORELIKE queries.
Since 1.3.0 the default value of DEFAULTOP has been AND, which typically
makes MORELIKE queries much less useful since they'll only match documents
containing all the terms from the query expansion. We now explicitly insert
" OR " between the terms if DEFAULTOP hasn't been set to OR, which makes them
work much more like they did in 1.2.x.
* Make $stoplist and $unstem consider all query strings by always passing the
new Xapian::QueryParser::FLAG_ACCUMULATE flag.
* Add $foreach command which works like $map, but just concatenates the
evaluated results rather than adding tabs to turn them into an OmegaScript
list.
* Extend $include{} to allow handling failure to open the specified file via an
optional second argument which if specified will be evaluated and returned
instead. Patch from Gaurav Arora.
* Support multiple MORELIKE parameters - we now form an RSet from all the
specified documents and use that to generate the query to run (previously
only one of multiple MORELIKE parameters was used).
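The MORELIKE change in this release comes down to joining the expansion terms
with an explicit operator. A minimal Python sketch of that idea follows. It is
not omega's actual code (omega is written in C++), and the function name
build_morelike_query and its parameters are hypothetical.

```python
def build_morelike_query(terms, default_op="AND"):
    """Join expansion terms so they are OR-ed together whatever default_op is.

    Hypothetical helper for illustration only, not part of omega.
    """
    # With DEFAULTOP=OR a plain space already means OR, so no operator is needed.
    separator = " " if default_op.upper() == "OR" else " OR "
    return separator.join(terms)
```

With the default of AND, the terms are joined with " OR "; with DEFAULTOP set to
OR they are left space-separated, as the entry describes.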
|
2020-08-31 20:13:29 by Thomas Klausner | Files touched by this commit (3631) |
Log message:
*: bump PKGREVISION for perl-5.32.
|
2020-08-21 22:46:05 by Amitai Schleier | Files touched by this commit (1) |
Log message:
Update to 1.4.17. From the changelog:
documentation:
* Document comment format supported by scriptindex index scripts. We've
supported comments on a line by themselves and introduced with a # since
scriptindex was first added back in 2002, but it seems to have never actually
been documented before now.
omega:
* Check for SERVER_PROTOCOL=INCLUDED before anything which might throw an
exception so that if it is set we suppress the Content-Type: when reporting
such exceptions. Spotted by Gaurav Arora.
* Report get_description() for Xapian::Error exceptions instead of get_msg().
This means we now report the exception's type, context (useful for network
errors), and errno information.
* Avoid leaking MyStopper object. The object essentially has the lifespan of
omega itself, but becomes unreachable when the QueryParser object is
destroyed. To make it easier to use leak-checking tools, hand ownership of
this object to the QueryParser object.
testsuite:
* omegatest: Tell leak sanitizer not to report leaks for allocations which
aren't explicitly released on exit - the OS will reclaim all memory from the
process at this point and explicitly releasing everything just takes time for
no real benefit. We will still see leaks of objects which become unreachable
during a run.
|
2020-06-10 19:56:10 by Amitai Schleier | Files touched by this commit (1) |
Log message:
Update to 1.4.16. From the changelog:
indexers:
* Fix handling of XML empty tag syntax when there's a quoted parameter right
before the closing `/>`. This caused `<title xml:lang="en-US"/>` to treat
the body text as the document title. Spotted by Gaurav Arora.
* omindex: Fix killing of filter child process if the parent process receives a
signal. Spotted by Gaurav Arora.
omega:
* Reject $setrelevant without an argument list. This has never been documented
as allowed, and previously crashed with a segfault. Fixes #802, reported by
Gaurav Arora.
* If there's an error opening the databases we now close any we managed to open
successfully before the error so that things like $dbsize can't end up
reporting values for a subset of the specified databases.
portability:
* Use our own autoconf cache variable namespace (xo_cv_ prefix instead of
ac_cv_) to avoid colliding with standard autoconf macro use if config.site or
a shared config.cache is used. The former case caused a build failure for
the OpenBSD port with 1.4.15, reported by Lucas R.
|
2020-02-25 18:55:47 by Amitai Schleier | Files touched by this commit (2) |
Log message:
Update to 1.4.15. From the changelog:
documentation:
* Update documentation about how to add a new format to omindex. Patch from
Bruno Baruffaldi.
indexers:
* Check for a BOM on HTML files, which for HTML5 should determine the encoding.
omega:
* Allow $if{COND} without any actions which is useful as a way to evaluate
something but ignore the result if you just want the side effects. Indeed
we were already recommending it as the way to ignore the return value
of $log. Fixes bug introduced in 1.4.14, reported by tuftedocelot.
* Add OmegaScript support for $jsonbool{COND} for encoding a boolean value for
use in JSON. This is equivalent to $if{COND,true,false} but more readable.
* Add OmegaScript support for $jsonobject{} which allows producing a JSON
object from an OmegaScript map.
* Allow specifying a format to $jsonarray{} so it is no longer restricted to
producing an array of strings.
* Add $keys{MAP} OmegaScript command which gives a sorted list of the keys from
an OmegaScript map.
portability:
* Simplify probes for snprintf. The broken snprintf in libbsd in Linux libc4
is from ~25 years ago so way too ancient to matter now, and all callers
already handle the pre-ISO semantics of returning -1 for an undersize buffer
so we don't need to run a test program to probe for this at configure time,
which is more cross-compile friendly.
* Avoid deprecation warning on recent Linux. We were including sys/sysctl.h if
it existed, which it does on Linux but we don't actually use it there.
Including it now warns that it is deprecated, so skip including it under
Linux. Reported on IRC by kumaran.
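The BOM check for HTML files can be pictured as a prefix match on the raw
bytes. The Python sketch below is an assumption-laden illustration, not the
omindex implementation: the changelog does not say which BOMs are checked, so
the set used here (UTF-8 and both UTF-16 byte orders) and the name
encoding_from_bom are hypothetical.

```python
import codecs

# Assumed set of BOMs; the changelog does not list which ones omindex checks.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
)

def encoding_from_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A caller would consult this before any charset declared in the markup or
supplied externally, falling back to those only when it returns None.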
|
2019-12-17 04:54:18 by Amitai Schleier | Files touched by this commit (2) |
Log message:
Update to 1.4.14. From the changelog:
documentation:
* Improve omindex --help docs for --duplicates.
* Document that $log will start to return an error message in 1.5.0, and that
one can wrap it using a $if with no action now to be future-proof.
indexers:
* Add built-in support for iso-8859-15 so we can handle it without iconv.
This charset is a variant of iso-8859-1 with 8 characters changed, most
notably including the euro currency symbol. It's the most commonly seen
charset we didn't have built-in support for.
* Optimise converting us-ascii to UTF-8 to do nothing, like we already do when
converting UTF-8 to UTF-8.
* scriptindex:
+ Add new 'gap' action which provides a way to leave a gap in the term
positions between fields to prevent phrases and positional operators from
matching across fields.
omega:
* Fix error handling in $lookup. We now check for errors from cdb_init()
and cdb_get(). We've never checked for errors from cdb_init(), while
for cdb_get() this bug was introduced by a warning fix in 1.2.20.
templates:
* Future-proof use of $log against changes in 1.5.0.
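The claim that iso-8859-15 differs from iso-8859-1 in 8 characters can be
checked with Python's built-in codecs. This is only a verification sketch using
the standard library, unrelated to how omega implements the charset; the
function name differing_bytes is made up for the example.

```python
def differing_bytes(enc_a="iso8859_15", enc_b="latin_1"):
    """Return the byte values that decode to different characters in the two encodings."""
    return [
        b for b in range(256)
        if bytes([b]).decode(enc_a) != bytes([b]).decode(enc_b)
    ]

# Show each differing byte with its iso-8859-1 and iso-8859-15 characters.
for b in differing_bytes():
    raw = bytes([b])
    print(f"0x{b:02X}: {raw.decode('latin_1')!r} -> {raw.decode('iso8859_15')!r}")
```

Byte 0xA4 is the one that becomes the euro sign in iso-8859-15.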
|
2019-08-11 15:25:21 by Thomas Klausner | Files touched by this commit (3557) |
Log message:
Bump PKGREVISIONs for perl 5.30.0
|
2019-08-02 23:29:11 by Amitai Schleier | Files touched by this commit (2) |
Log message:
Update to 1.4.12. From the changelog:
documentation:
* Improve docs for OmegaScript $hitlist{}.
* Fix RST formatting errors in omega docs.
* Clarify use of Q prefix for unique ID terms - it was described as "reserved",
but the use of "Q" is really just a convention (and in fact omindex uses "U"
not "Q").
* Clarify scriptindex's weight action takes parameter >= 0.
* Correct typo in OmegaScript $add parameter documentation.
indexers:
* omindex:
+ Fix typo in mimetypes used for Apple iWork documents ("apply" instead of
"apple") which meant that these documents weren't actually being indexed.
Patch from Bruno Baruffaldi.
+ Pipe input to ps2pdf as this accepts input on stdin. Possibility pointed
out by Gaurav Arora.
* scriptindex:
+ If parsedate action's format includes %z adjust for the timezone if
possible (this requires the non-POSIX tm_gmtoff member of struct tm)
and flag an error for other platforms.
+ If parsedate action's format includes %Z flag an error as that doesn't
seem to be usefully supported by strptime() anywhere.
+ Fix parsedate action to treat formats without a timezone as being UTC
instead of localtime.
+ Add date=unixutc. The existing date=unix works in localtime which is
unhelpful if you want to use it on the output of parsedate since that's in
UTC; date=unixutc is just like date=unix except it always works in UTC.
+ The date action now emits a warning for invalid values. The documentation
used to say "invalid values are ignored at present", but it's more helpful
to flag bad data than quietly ignore it.
+ We now check the date action's parameter at script parse time and unknown
values result in an error and nothing being indexed. Previously an unknown
format uselessly resulted in the terms D, M and Y literally being added to
every document.
+ The split action now supports a new "prefixes" split style. This gives all
the prefixes from the split, so split=/,prefixes on a file path gives all
parent directories.
omega:
* Remove documented limitation of $subdb and $subid - the implementation
assumed that each omega database name corresponded to a single Xapian
database, and if a database name referred to a stub database file expanding
to multiple Xapian databases then they would misbehave. Such cases are now
handled properly as well.
* Extend $addfilter to support adding negated filters via a new optional second
argument which specifies the type of filter to add.
* Stop $sort from needlessly ensuring the match has run.
* Handle corner case of nested $hitlist gracefully instead of potentially
entering an infinite loop.
testsuite:
* omegatest: Avoid setting TZ globally during tests as that hides bugs where
behaviour depends on the local timezone when it shouldn't.
* omegatest: Support testing when built using LeakSanitizer by suppressing
leak reports for cached compiled pcre regular expressions. These aren't
released when the program exits but aren't memory leaks.
build system:
* Remove outdated deprecation warning suppression which was there to support
building from git in the run up to 1.3.2 - a development version released
nearly 5 years ago.
portability:
* Fix problems with fallback strptime() implementation which was being included
in the wrong binary, and was lacking a required const_cast on the return
value.
* Rework setenv() compatibility handling. Now that Solaris 9 is dead we can
assume setenv() is provided by Unix-like platforms (POSIX requires it). For
other platforms, provide a compatibility implementation of setenv() so the
compatibility code is encapsulated in one place rather than replicated at
every use.
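The "prefixes" split style added to scriptindex in this release can be
illustrated in a few lines of Python. The exact output of scriptindex is an
assumption here: the changelog only says a file path yields all parent
directories, so this sketch omits the full value and the empty prefix before a
leading delimiter. The name split_prefixes is hypothetical.

```python
def split_prefixes(value, delimiter="/"):
    """Return each proper prefix of value that ends just before a delimiter.

    Assumed semantics for illustration; scriptindex itself may differ in
    whether the full value or an empty leading prefix is included.
    """
    parts = value.split(delimiter)
    prefixes = []
    for i in range(1, len(parts)):
        prefix = delimiter.join(parts[:i])
        if prefix:  # skip the empty prefix produced by a leading delimiter
            prefixes.append(prefix)
    return prefixes
```

Under these assumptions, a path such as /home/user/docs/file.txt gives /home,
/home/user and /home/user/docs, which matches the "all parent directories"
description.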
|
2019-03-10 14:21:05 by Amitai Schleier | Files touched by this commit (3) |
Log message:
Avoid conflicting with system bswap32(). Use SUBST_VARS to mollify pkglint.
|