./textproc/xapian-omega, Search engine application for websites using Xapian

Branch: CURRENT, Version: 1.4.17nb1, Package name: xapian-omega-1.4.17nb1, Maintainer: schmonz

Omega operates on a set of databases. Each database is created and
updated separately using either omindex or scriptindex. You can
search these databases (or any other Xapian database with suitable
contents) via a web front-end provided by omega, a CGI application.
A search can also be done over more than one database at once.

Required to run:
[lang/perl5] [devel/pcre] [textproc/xapian]

SHA1: 7181ca7985cefc10125a5f5b5a64827f9dec4aac
RMD160: 47022bd5c79eaf012f617bd70b47926adde17754
Filesize: 534.543 KB

CVS history:

   2020-08-31 20:13:29 by Thomas Klausner | Files touched by this commit (3631) | Package updated
Log message:
*: bump PKGREVISION for perl-5.32.
   2020-08-21 22:46:05 by Amitai Schleier | Files touched by this commit (1) | Package updated
Log message:
Update to 1.4.17. From the changelog:


* Document comment format supported by scriptindex index scripts.  We've
  supported comments on a line by themselves and introduced with a # since
  scriptindex was first added back in 2002, but it seems to have never actually
  been documented before now.


* Check for SERVER_PROTOCOL=INCLUDED before anything which might throw an
  exception so that if it is set we suppress the Content-Type: when reporting
  such exceptions.  Spotted by Gaurav Arora.

* Report get_description() for Xapian::Error exceptions instead of get_msg().
  This means we now report the exception's type, context (useful for network
  errors), and errno information.

* Avoid leaking MyStopper object.  The object essentially has the lifespan of
  omega itself, but becomes unreachable when the QueryParser object is
  destroyed.  To make it easier to use leak-checking tools, hand ownership of
  this object to the QueryParser object.


* omegatest: Tell leak sanitizer not to report leaks for allocations which
  aren't explicitly released on exit - the OS will reclaim all memory from the
  process at this point and explicitly releasing everything just takes time for
  no real benefit.  We will still see leaks of objects which become unreachable
  during a run.
   2020-06-10 19:56:10 by Amitai Schleier | Files touched by this commit (1) | Package updated
Log message:
Update to 1.4.16. From the changelog:


* Fix handling of XML empty tag syntax when there's a quoted parameter right
  before the closing `/>`.  This caused `<title xml:lang="en-US"/>` to treat
  the body text as the document title.  Spotted by Gaurav Arora.

* omindex: Fix killing of filter child process if the parent process receives a
  signal.  Spotted by Gaurav Arora.


* Reject $setrelevant without an argument list.  This has never been documented
  as allowed, and previously crashed with a segfault.  Fixes #802, reported by
  Gaurav Arora.

* If there's an error opening the databases we now close any we managed to open
  successfully before the error so that things like $dbsize can't end up
  reporting values for a subset of the specified databases.


* Use our own autoconf cache variable namespace (xo_cv_ prefix instead of
  ac_cv_) to avoid colliding with standard autoconf macro use if config.site or
  a shared config.cache is used.  The former case caused a build failure for
  the OpenBSD port with 1.4.15, reported by Lucas R.
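
The "close any we managed to open" behaviour described in the 1.4.16 entry
above can be illustrated with a minimal Python sketch.  This is not Omega's
code: FakeDatabase and open_all_or_none are hypothetical names used only to
show the all-or-nothing pattern, and the rule that paths starting with "bad"
fail to open is an assumption made for the demonstration.

```python
class FakeDatabase:
    """Hypothetical stand-in for a database handle (not a Xapian class)."""

    def __init__(self, path):
        # Assumption for the demo: paths starting with "bad" cannot be opened.
        if path.startswith("bad"):
            raise OSError(f"cannot open {path}")
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


def open_all_or_none(paths, opener=FakeDatabase):
    """Open every path; on any failure close those already opened and re-raise."""
    opened = []
    try:
        for path in paths:
            opened.append(opener(path))
    except Exception:
        # Never leave a partial set open, so later size or statistics
        # queries cannot report values for only some of the databases.
        for db in opened:
            db.close()
        raise
    return opened
```

With this shape a caller either gets handles for every requested database or
an exception, never a silently reduced set.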
   2020-02-25 18:55:47 by Amitai Schleier | Files touched by this commit (2) | Package updated
Log message:
Update to 1.4.15. From the changelog:


* Update documentation about how to add a new format to omindex.  Patch from
  Bruno Baruffaldi.


* Check for a BOM on HTML files, which for HTML5 should determine the encoding.


* Allow $if{COND} without any actions, which is useful as a way to evaluate
  something but ignore the result if you just want the side effects.  Indeed
  we were already recommending its use if you want to ignore the return value
  of $log.  Fixes bug introduced in 1.4.14, reported by tuftedocelot.

* Add OmegaScript support for $jsonbool{COND} for encoding a boolean value for
  use in JSON.  This is equivalent to $if{COND,true,false} but more readable.

* Add OmegaScript support for $jsonobject{} which allows producing a JSON
  object from an OmegaScript map.

* Allow specifying a format to $jsonarray{} so it is no longer restricted to
  producing an array of strings.

* Add $keys{MAP} OmegaScript command which gives a sorted list of the keys from
  an OmegaScript map.


* Simplify probes for snprintf.  The broken snprintf in libbsd in Linux libc4
  is from ~25 years ago so way too ancient to matter now, and all callers
  already handle the pre-ISO semantics of returning -1 for an undersize buffer
  so we don't need to run a test program to probe for this at configure time,
  which is more cross-compile friendly.

* Avoid deprecation warning on recent Linux.  We were including sys/sysctl.h if
  it existed, which it does on Linux but we don't actually use it there.
  Including it now warns that it is deprecated, so skip including it under
  Linux.  Reported on IRC by kumaran.
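
The BOM check mentioned in the 1.4.15 entry above can be sketched in Python.
This is only an illustration of the idea, not Omega's implementation: the
function name sniff_bom and the choice to check just the UTF-8 and UTF-16
byte order marks are assumptions.

```python
import codecs

# Byte order marks paired with the encoding each one implies.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, payload without the BOM), or (None, data) if no BOM."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data
```

A caller would fall back to other hints (an HTTP header or a meta tag, for
example) only when sniff_bom returns None for the encoding.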
   2019-12-17 04:54:18 by Amitai Schleier | Files touched by this commit (2) | Package updated
Log message:
Update to 1.4.14. From the changelog:


* Improve omindex --help docs for --duplicates.
* Document that $log will start to return an error message in 1.5.0, and that
  one can wrap it using a $if with no action now to be future-proof.


* Add built-in support for iso-8859-15 so we can handle it without iconv.
  This charset is a variant of iso-8859-1 with 8 characters changed, most
  notably including the euro currency symbol.  It's the most commonly seen
  charset we didn't have built-in support for.
* Optimise converting us-ascii to UTF-8 to do nothing, like we already do when
  converting UTF-8 to UTF-8.
* scriptindex:
  + Add new 'gap' action which provides a way to leave a gap in the term
    positions between fields to prevent phrases and positional operators from
    matching across fields.


* Fix error handling in $lookup.  We now check for errors from cdb_init()
  and cdb_get().  We've never checked for errors from cdb_init(), while
  for cdb_get() this bug was introduced by a warning fix in 1.2.20.


* Future-proof use of $log against changes in 1.5.0.
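
The 1.4.14 entry's remark that iso-8859-15 is iso-8859-1 with 8 characters
changed can be checked with a short Python sketch.  It relies on Python's own
codec tables rather than anything in Omega, and the function name
differing_bytes is made up for this example.

```python
def differing_bytes(a="iso-8859-1", b="iso-8859-15"):
    """List the byte values that decode to different characters in a and b."""
    return [
        byte
        for byte in range(256)
        if bytes([byte]).decode(a) != bytes([byte]).decode(b)
    ]


if __name__ == "__main__":
    diff = differing_bytes()
    # Show the count and the byte positions in hex.
    print(len(diff), [hex(byte) for byte in diff])
```

Byte 0xA4 is one of the positions reported; in iso-8859-15 it decodes to the
euro sign (U+20AC).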
   2019-08-11 15:25:21 by Thomas Klausner | Files touched by this commit (3557) | Package updated
Log message:
Bump PKGREVISIONs for perl 5.30.0
   2019-08-02 23:29:11 by Amitai Schleier | Files touched by this commit (2) | Package updated
Log message:
Update to 1.4.12. From the changelog:


* Improve docs for OmegaScript $hitlist{}.

* Fix RST formatting errors in omega docs.

* Clarify use of Q prefix for unique ID terms - it was described as \ 
  but the use of "Q" is really just a convention (and in fact omindex uses "U"
  not "Q").

* Clarify scriptindex's weight action takes parameter >= 0.

* Correct typo in OmegaScript $add parameter documentation.


* omindex:

  + Fix typo in mimetypes used for Apple iWork documents ("apply" instead of
    "apple") which meant that these documents weren't actually being indexed.
    Patch from Bruno Baruffaldi.

  + Pipe input to ps2pdf as this accepts input on stdin.  Possibility pointed
    out by Gaurav Arora.

* scriptindex:

  + If parsedate action's format includes %z adjust for the timezone if
    possible (this requires the non-POSIX tm_gmtoff member of struct tm)
    and flag an error for other platforms.

  + If parsedate action's format includes %Z flag an error as that doesn't
    seem to be usefully supported by strptime() anywhere.

  + Fix parsedate action to treat formats without a timezone as being UTC
    instead of localtime.

  + Add date=unixutc.  The existing date=unix works in localtime which is
    unhelpful if you want to use it on the output of parsedate since that's in
    UTC; date=unixutc is just like date=unix except it always works in UTC.

  + The date action now emits a warning for invalid values.  The documentation
    used to say "invalid values are ignored at present", but it's more helpful
    to flag bad data than quietly ignore it.

  + We now check the date action's parameter at script parse time and unknown
    values result in an error and nothing being indexed.  Previously an unknown
    format uselessly resulted in the terms D, M and Y literally being added to
    every document.

  + The split action now supports a new "prefixes" split style.  This gives all
    the prefixes from the split, so split=/,prefixes on a file path gives all
    parent directories.


* Remove documented limitation of $subdb and $subid - the implementation
  assumed that each omega database name corresponded to a single Xapian
  database, and if a database name referred to a stub database file expanding
  to multiple Xapian databases then they would misbehave.  Such cases are now
  handled properly as well.

* Extend $addfilter to support adding negated filters via a new optional second
  argument which specifies the type of filter to add.

* Stop $sort from needlessly ensuring the match has run.

* Handle corner case of nested $hitlist gracefully instead of potentially
  entering an infinite loop.


* omegatest: Avoid setting TZ globally during tests as that hides bugs where
  behaviour depends on the local timezone when it shouldn't.

* omegatest: Support testing when built using LeakSanitizer by suppressing
  leak reports for cached compiled pcre regular expressions.  These aren't
  released when the program exits but aren't memory leaks.

build system:

* Remove outdated deprecation warning suppression which was there to support
  building from git in the run up to 1.3.2 - a development version released
  nearly 5 years ago now.


* Fix problems with fallback strptime() implementation which was being included
  in the wrong binary, and was lacking a required const_cast on the return
  value.
* Rework setenv() compatibility handling.  Now that Solaris 9 is dead we can
  assume setenv() is provided by Unix-like platforms (POSIX requires it).  For
  other platforms, provide a compatibility implementation of setenv() so the
  compatibility code is encapsulated in one place rather than replicated at
  every use.
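
The difference between date=unix and date=unixutc described in the 1.4.12
entry above can be sketched in Python.  This is not scriptindex's code: the
function name date_terms is hypothetical, and the exact term shapes used here
(D + YYYYMMDD, M + YYYYMM, Y + YYYY) are an assumption for illustration.

```python
import time


def date_terms(timestamp, utc):
    """Build day, month and year terms from a Unix timestamp.

    utc=True mirrors the idea of date=unixutc (always UTC);
    utc=False mirrors date=unix (the machine's local timezone).
    """
    tm = time.gmtime(timestamp) if utc else time.localtime(timestamp)
    return [
        time.strftime("D%Y%m%d", tm),
        time.strftime("M%Y%m", tm),
        time.strftime("Y%Y", tm),
    ]
```

For a timestamp close to midnight UTC the two modes can yield different day
terms depending on the local timezone, which is why a UTC variant is the
better match for the UTC output of parsedate.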
   2019-03-10 14:21:05 by Amitai Schleier | Files touched by this commit (3)
Log message:
Avoid conflicting with system bswap32(). Use SUBST_VARS to mollify pkglint.