Subject: CVS commit: pkgsrc/databases/apache-solr
From: Jean-Yves Migeon
Date: 2021-12-18 16:34:20
Message id: 20211218153420.47FDFFAEC@cvs.NetBSD.org

Log Message:
Upgrade apache-solr to 8.11.1.

Please consult the Solr Upgrade notes for details and links specified below:

    https://github.com/apache/solr/blob/main/solr/solr-ref-guide/src/solr-upgrade-notes.adoc

====================================================================

Changelog:

Solr 8.11

See the 8.11 Release Notes for an overview of the main new features of Solr 8.11.

When upgrading to 8.11.x users should be aware of the following major changes \ 
from 8.10.

Support for Multiple Authentication Schemes

Two new authentication and authorization plugins provide support for configuring \ 
multiple authentication schemes.

The MultiAuthPlugin allows combining two or more authentication approaches, such \ 
as JWT and Basic authentication.

The MultiAuthRuleBasedAuthorizationPlugin is used when the MultiAuthPlugin is \ 
also in use, and combines the various roles defined for all plugins to determine \ 
the proper role assignment for the user account.

For information on configuring these plugins, see the following sections:

    Combining Basic Authentication with Other Schemes

    Multiple Authorization Plugins
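As a rough illustration of how the two plugins pair up, the sketch below builds a minimal security.json skeleton in Python. The "schemes" layout and the per-scheme class names are assumptions recalled from the Solr reference guide, and the per-plugin settings (credentials, JWT issuer, permissions, role mappings) are deliberately omitted. Check the sections listed above before using anything like this.

```python
import json

# Assumed layout: one "schemes" entry per authentication approach, with a
# matching authorization plugin per scheme. Plugin-specific settings
# (credentials, JWT issuer details, permissions) are left out on purpose.
security = {
    "authentication": {
        "class": "solr.MultiAuthPlugin",
        "schemes": [
            {"scheme": "bearer", "class": "solr.JWTAuthPlugin"},
            {"scheme": "basic", "class": "solr.BasicAuthPlugin"},
        ],
    },
    "authorization": {
        "class": "solr.MultiAuthRuleBasedAuthorizationPlugin",
        "schemes": [
            {"scheme": "bearer",
             "class": "solr.ExternalRoleRuleBasedAuthorizationPlugin"},
            {"scheme": "basic",
             "class": "solr.RuleBasedAuthorizationPlugin"},
        ],
    },
}

# Serialize the skeleton; the result would still need the omitted settings
# before it could serve as a real security.json.
security_json = json.dumps(security, indent=2)
print(security_json)
```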

ZooKeeper chroot

It’s now possible to create the ZooKeeper chroot at startup if it does not \ 
already exist. See the section Using the -z Parameter with bin/solr for an \ 
example.

Other Changes

A few other minor changes are worth noting:

    The config-read pre-defined permission now correctly governs access for \ 
various configuration-related APIs. See also Predefined Permissions.

    The S3BackupRepository supports configuring the AWS Profile, if necessary. \ 
See also S3BackupRepository.

    Additionally, backups will now properly succeed after SPLITSHARD operations, \ 
and will correctly handle incremental backup purges.

    SolrJ now supports uploading configsets.

Solr 8.10

See the 8.10 Release Notes for an overview of the main new features of Solr 8.10.

When upgrading to 8.10.x users should be aware of the following major changes \ 
from 8.9.

Schema Designer UI

A new screen has been added to the Admin UI that allows you to interactively \ 
design a Solr schema using your documents.

The designer screen provides a safe environment for you to:

    Upload or paste sample documents to identify fields.

    Get a "first" guess at what Solr thinks the field types in the \ 
fields should be.

    Edit fields, field types, dynamic fields, and supporting files.

    See how a field’s analysis will impact your text.

    Test how schema changes will impact query-time behavior.

    Save your changes to a configset to use with a new collection.

See the section Schema Designer for full details.

Backups in S3

Following the redesign of backups in Solr 8.8 that allowed storage of \ 
incremental backups in Google Cloud environments, Solr 8.10 provides support for \ 
storing backups in Amazon S3 buckets.

See the section S3BackupRepository for how to configure.

Security Admin UI

Solr’s Admin UI also got a new screen to support management of users, roles, \ 
and permissions.

The new UI works when authentication and/or authorization has been enabled with \ 
bin/solr auth or by manually installing a security.json file. Until then, the \ 
screen shows a warning that your Solr instance is unsecured.

See the section Security UI for details.

Solr SQL Improvements

A number of improvements have been made in Solr’s SQL functionality:

    Support added for LIKE, IS NOT NULL, IS NULL, and wildcards (for simplistic \ 
LIKE functionality).

    Two new aggregation functions, COUNT(DISTINCT field) and \ 
APPROX_COUNT_DISTINCT(field), have been added.

    Queries using an ORDER BY clause can support OFFSET and FETCH operations.

    Multi-valued fields can now be returned.

    User permissions have been simplified so access to query endpoints /sql, \ 
/select, and /export is sufficient for full access for all SQL queries.
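A minimal sketch of a request body that exercises several of these additions (LIKE, IS NOT NULL, and OFFSET/FETCH with ORDER BY). The host, the collection name "techproducts", and the field names are hypothetical, and the exact OFFSET/FETCH spelling should be checked against the SQL section of the reference guide. The code only builds the form-encoded body and does not contact a server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the /sql handler of an assumed "techproducts" collection.
SOLR_SQL_URL = "http://localhost:8983/solr/techproducts/sql"

# Field names (id, name, price) are assumptions for illustration only.
stmt = (
    "SELECT id, name, price FROM techproducts "
    "WHERE name LIKE 'ipod%' AND price IS NOT NULL "
    "ORDER BY price DESC "
    "OFFSET 10 FETCH NEXT 10 ROWS ONLY"
)

# The /sql handler reads the statement from the "stmt" parameter;
# urlencode produces a form-encoded body suitable for an HTTP POST.
body = urlencode({"stmt": stmt})
print(SOLR_SQL_URL)
print(body)
```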

shards.preference

A new option for the shards.preference parameter allows selection of nodes based \ 
on whether or not the replica is a leader. Now adding \ 
shards.preference=replica.leader:false will limit queries only to replicas which \ 
are not currently their shard’s leader.

See the section shards.preference Parameter for details and examples.
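For illustration, the parameter can be attached to an ordinary query like this. The host and the collection name "mycollection" are assumptions; the snippet only builds the URL.

```python
from urllib.parse import urlencode

# Hypothetical query endpoint for an assumed collection.
base = "http://localhost:8983/solr/mycollection/select"

params = {
    "q": "*:*",
    # Route the query only to replicas that are not their shard's leader.
    "shards.preference": "replica.leader:false",
}

url = base + "?" + urlencode(params)
print(url)
```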

Metrics & Prometheus Exporter

A new expr option in the Metrics API allows for more advanced filtering of \ 
metrics based on regular expressions. See the section Metrics API for examples.

The Prometheus Exporter’s default solr-exporter.config has been improved to \ 
use the new expr option in the Metrics API to get a smaller set of metrics. The \ 
default metrics exported still include most metrics, but the configuration will \ 
be easier to trim as needed. This should help provide performance improvements \ 
in busy clusters being monitored by Prometheus.

ZooKeeper Credentials

ZooKeeper credentials can now be stored in a file whose location is defined with \ 
a system property instead of being passed in plain-text. See Out of the Box \ 
Credential Implementations for how to set this up.

Solr 8.9

See the 8.9 Release Notes for an overview of the main new features of Solr 8.9.

When upgrading to 8.9.x users should be aware of the following major changes \ 
from 8.8.

Backup and Restore

Solr 8.9 introduces extensive changes to Solr’s backup and restore support.

A new backup format has been introduced in Solr 8.9 which replaces the previous \ 
snapshot-based backup. This new format enables ‘incremental’ backups. \ 
Repeated backups to a given location will take advantage of the data stored by \ 
their predecessors and will only operate on files that have changed since the \ 
previous backup. This is supported by default, simply by storing each backup \ 
file in the same location.

The old and new formats are not compatible, although backups in the old format, \ 
a full snapshot of all files, can still be used to restore to Solr for the \ 
time being. The old format is officially deprecated, and support for it is \ 
likely to be removed in Solr 9.0.

For the time being, the old format can be created by defining a parameter \ 
incremental=false. Again, though, this support is likely to be removed in Solr \ 
9.0.

More documentation on backups is available at Backup and Restore.

New Collections API commands for backups:

    LISTBACKUP: Lists information about each backup stored at the specified \ 
repository location. See List Backups for more details.

    DELETEBACKUP: Deletes specified backups from the repository. See Delete \ 
Backups for more details.
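The two commands above are ordinary Collections API calls. The sketch below builds example URLs for them. The host, the backup name "nightly", and the location are hypothetical. The maxNumBackupPoints parameter is taken from memory of the Delete Backups documentation and should be verified there.

```python
from urllib.parse import urlencode

# Assumed Collections API endpoint on a local node.
COLLECTIONS_API = "http://localhost:8983/solr/admin/collections"


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    return COLLECTIONS_API + "?" + urlencode({"action": action, **params})


# List the backups stored under a (hypothetical) name and location.
list_url = collections_url("LISTBACKUP", name="nightly", location="/backups/solr")

# Delete older backups, keeping only the 7 most recent backup points.
delete_url = collections_url(
    "DELETEBACKUP", name="nightly", location="/backups/solr",
    maxNumBackupPoints=7,
)

print(list_url)
print(delete_url)
```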

A new option for backup repository is also available in 8.9, which is to use \ 
Google Cloud Storage (GCS). This is a contrib (located in \ 
contrib/gcs-repository). See GCSBackupRepository for configuration details. The \ 
Solr community is working to add support for S3 buckets in the near future.

Nested Docs

Child Doc Transformer’s childFilter parameter no longer applies query syntax \ 
escaping because it’s inconsistent with the rest of Solr and was limiting. \ 
This refers to [child childFilter='field:value']. There was no escaping here \ 
prior to 8.0 either.

Collapse and Expand

    BlockCollapse: If documents have been (or could be) indexed in a way where \ 
documents with the same collapse key have been indexed contiguously in the \ 
index, a new "block collapse" provides a significant speed improvement \ 
over traditional collapse.

    See Block Collapsing for details.

    Expand Null Groups: A new parameter expand.nullGroup allows an expanded \ 
group to be returned containing documents with no value in the expanded field. \ 
See Expand Component for details.

In-Place Updates

A new request parameter update.partial.requireInPlace=true allows telling Solr \ 
to "fail fast" if all of the necessary conditions are not satisfied to \ 
allow an in-place update to succeed. See also In-Place Updates.
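A small sketch of what such a request might look like. The collection name, the document ID, and the "popularity" field are hypothetical, and the field would have to meet the in-place update conditions described in the reference guide. The code only prepares the URL and JSON body.

```python
import json
from urllib.parse import urlencode

# Hypothetical update endpoint; the flag asks Solr to reject the request
# rather than silently fall back to a regular atomic update.
update_url = "http://localhost:8983/solr/mycollection/update?" + urlencode({
    "update.partial.requireInPlace": "true",
    "commit": "true",
})

# Atomic "inc" operation on an assumed numeric field.
docs = [{"id": "doc-1", "popularity": {"inc": 1}}]
body = json.dumps(docs)

print(update_url)
print(body)
```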

Metrics History

The Metrics History feature, which allowed long-term storage and aggregation of \ 
Solr’s metrics, has been deprecated and will be removed in 9.0.

Embedded Solr Server

When using EmbeddedSolrServer, it will no longer close CoreContainer instances \ 
that were passed to it.

Solr 8.8

When upgrading to 8.8.x users should be aware of the following major changes \ 
from 8.7.

Nested Documents

    When doing atomic/partial updates to a child document:

        Supply the _root_ field (the ID of the root document) so that Solr \ 
understands you are manipulating a child document and not a root document. In \ 
its absence, Solr looks at the _route_ parameter but this may change in the \ 
future because it’s not an ideal substitute. If neither is present, Solr \ 
assumes you are updating a root document. If this assumption is false, Solr will \ 
do a cheap check that usually detects the problem and will throw an exception to \ 
alert you to the need to specify the Root ID. This backwards-incompatible change \ 
was made to increase performance and robustness.

        This feature no longer requires stored=true or docValues=true on the \ 
_root_ field. You might have it for other purposes though (e.g., for \ 
uniqueBlock(…)).

        This feature no longer requires the _nest_path_ field, although you \ 
probably ought to continue to define it as it’s useful for other things.
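To make the first point concrete, here is a sketch of an atomic update to a child document that names its root. The IDs and the "price" field are made up for illustration; only the presence of the _root_ field reflects the note above.

```python
import json

# Atomic update of a child document. "_root_" carries the ID of the root
# document so Solr knows this is a child, not a root, being modified.
child_update = {
    "id": "child-42",        # hypothetical child document ID
    "_root_": "parent-7",    # hypothetical root document ID
    "price": {"set": 19.99}, # assumed field, updated atomically
}

payload = json.dumps([child_update])
print(payload)
```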

Removed Contribs

    The search results clustering contrib (Carrot2) has been removed from 8.x \ 
Solr due to lack of Java 1.8 compatibility in the dependency that provides \ 
online clustering of search results. The contrib will be re-introduced in Solr \ 
9.0.

Learning to Rank

    Interleaving support has been added to Learning to Rank (LTR). Currently \ 
only the Team Draft Interleaving algorithm is supported. For examples using this \ 
feature, see the section Running a Rerank Query Interleaving Two Models.

Metrics

    Two metrics have been added for SolrCloud’s Overseer:

        solr_metrics_overseer_stateUpdateQueueSize

        solr_metrics_overseer_collectionWorkQueueSize

Prometheus Exporter

    The ./bin scripts included with the Prometheus Exporter contrib now allow \ 
use of custom java options with environment variables. See the section \ 
Environment Variable Options for more details.

    The default Grafana dashboards now include panels for query performance \ 
monitoring. The default Prometheus Exporter configuration includes metrics like \ 
queries-per-second (QPS) and 95th percentiles (P95) to populate the new panels.

    The default Prometheus Exporter configuration also includes the two new \ 
metrics mentioned in the Metrics above.

Solr Home

    The internal logic for identifying 'Solr Home' (solr.solr.home) has been \ 
refactored to make testing less error prone. Plugin developers using \ 
SolrPaths.locateSolrHome() or 'new SolrResourceLoader' should check \ 
deprecation warnings, as some existing functionality will be removed in \ 
9.0. SOLR-14934 has more technical details about this change for those \ 
concerned.

base_url removed from stored state

As of Solr 8.8.0, the base_url property was removed from the stored state for \ 
replicas (SOLR-12182). If you’re able to upgrade SolrJ to 8.8.x for all of \ 
your client applications, then you can set -Dsolr.storeBaseUrl=false (introduced \ 
in Solr 8.8.1) to better align the stored state in ZooKeeper with future \ 
versions of Solr. However, if you are not able to upgrade SolrJ to 8.8.x for all \ 
client applications, then leave the default -Dsolr.storeBaseUrl=true so that \ 
Solr will continue to store the base_url in ZooKeeper.

You may also see some NullPointerExceptions (NPEs) in collection state updates \ 
during a rolling upgrade to 8.8.0 from a previous version of Solr. After \ 
upgrading all nodes in your cluster to 8.8.0, collections should fully recover. \ 
If any replicas do not recover after the upgrade, trigger another rolling \ 
restart to re-elect leaders.

Solr 8.7

See the 8.7 Release Notes for an overview of the main new features of Solr 8.7.

When upgrading to 8.7.x users should be aware of the following major changes \ 
from 8.6.

Autoscaling

    If upgrading from 8.6.0, please see the 8.6.1 Upgrade notes below for \ 
information on performance degradations introduced in 8.6.0 that require some \ 
intervention to resolve. If you are already on 8.6.1 or higher, you can ignore \ 
these instructions.

Deprecations

    The autoscaling framework is now formally deprecated and will be removed in \ 
Solr 9.0. The Solr community is working on a pluggable API to replace this \ 
functionality, with the goal for it to be ready by the time 9.0 is released. \ 
Deprecations include: autoscaling policy, triggers, withCollection support, \ 
simulation framework, autoscaling suggestions tab in the UI, autoAddReplicas and \ 
UTILIZENODE command.

    Similarly, rule-based replica placement strategy has been deprecated and \ 
will be replaced in Solr 9.0 by APIs for replica placement and cluster events, \ 
with plugin-based implementations.

    Support for detecting spinning disks has been removed in LUCENE-9576. \ 
Corresponding spins metrics in Solr still exist but now they always return false \ 
and will be removed in Solr 9.0.

User-Managed Cluster Terminology Updated

    Solr has replaced the terms "master" and "slave" in the \ 
codebase and all documentation with "leader" and "follower".

    Functionally, only parameter names have changed, and we do not expect any \ 
back-compatibility issues on upgrade to 8.7, or later to 9.0.

    However, users should update their solrconfig.xml files after completing the \ 
upgrade on all nodes of a cluster. Comparing your configuration to the updated \ 
configuration examples in Index Replication will show examples of what needs to \ 
change, but here are the main changes:

        On the replication leader, in the definition of the /replication request \ 
handler:

            Replace "master" with "leader".

            Replace "slave" with "follower" if the former \ 
term is used in the name of any follower solrconfig.xml file definitions. This \ 
file can be named anything, so you can rename it however you like.

            Replace "slave" with "follower" if the former \ 
term is used in a replication repeater configuration.

        On the replication follower, in the definition of the /replication \ 
request handler:

            Replace "masterUrl" with "leaderUrl".

            Replace "slave" with "follower" if the former \ 
term is used in a repeater configuration.

JSON Facets

    Performance enhancements for the relatedness() statistics function are \ 
included with 8.7. These yield the highest benefits with high-cardinality fields \ 
and can be disabled if working with lower cardinality fields with a new \ 
sweep_collection parameter. See the section relatedness() Options for details.

solr.in.sh / solr.in.cmd

    Solr previously relied on the SOLR_STOP_WAIT parameter defined in solr.in.sh or \ 
solr.in.cmd to determine how long to wait on startup. A new parameter, \ 
SOLR_START_WAIT, allows defining how long Solr should wait for startup to \ 
complete.

    If the time set by this parameter is exceeded, Solr will exit the startup \ 
process and return the last few lines of the solr.log file to the terminal.

    By default, this parameter is set to the same value as SOLR_STOP_WAIT.

    The default ZooKeeper client timeout (ZK_CLIENT_TIMEOUT) is now 30 seconds \ 
(30000 milliseconds) instead of 15.

Configsets

    It’s now possible to overwrite an existing configset when uploading \ 
changes by supplying the overwrite=true parameter to the Configset API.

    A related parameter is cleanup=true, which allows deleting any files from \ 
the old configset that are left behind after the overwrite.

    The default for both of these parameters is false.

    When deleting a collection that has an automatically created configset \ 
(i.e., the configset was copied from the _default collection when the collection \ 
was created), the configset will also be deleted if it is not in use by any \ 
other collection.
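As an illustration of the overwrite and cleanup parameters described above, the following builds the URL for an upload call. The host and the configset name "myConfigSet" are assumptions. The zipped configset itself would be sent as the POST body, which is not shown here.

```python
from urllib.parse import urlencode

# Hypothetical Configset API endpoint on a local node.
CONFIGS_API = "http://localhost:8983/solr/admin/configs"

params = {
    "action": "UPLOAD",
    "name": "myConfigSet",  # assumed configset name
    "overwrite": "true",    # replace the existing configset
    "cleanup": "true",      # drop files left over from the old configset
}

url = CONFIGS_API + "?" + urlencode(params)
print(url)
```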

Logging

    A request ID (rid) is now logged for all distributed search requests (in \ 
SolrCloud) which can be used to correlate query events across the system. A \ 
parameter disableRequestId=true can be added to disable this if desired.

Solr 8.6.1

See the 8.6.1 Release Notes for an overview of the fixes included in Solr 8.6.1.

When upgrading to 8.6.1 users should be aware of the following major changes \ 
from 8.6.0.

Autoscaling

    As mentioned in the 8.6 upgrade notes, a default autoscaling policy was \ 
provided starting in 8.6.0. This default autoscaling policy resulted in \ 
increasingly slow collection creation calls in large clusters (50+ collections).

    In 8.6.1 the default autoscaling policy has been removed, and clusters will \ 
not use autoscaling unless a policy has explicitly been created. If your cluster \ 
is running 8.6.0 and not using an explicit autoscaling policy, upgrade to 8.6.1 \ 
and remove the default cluster policy and preferences via the following command.

    Replace localhost:8983 with your Solr endpoint.

    curl -X POST -H 'Content-type:application/json' \
      -d '{set-cluster-policy : [], set-cluster-preferences : []}' \
      http://localhost:8983/api/cluster/autoscaling

    This information is only relevant for users upgrading from 8.6.0. If \ 
upgrading from an earlier version to 8.6.1+, this warning can be ignored.

Solr 8.6

See the 8.6 Release Notes for an overview of the main new features of Solr 8.6.

When upgrading to 8.6.x users should be aware of the following major changes \ 
from 8.5.

Support for Block-Max WAND

Lucene added support for Block-Max WAND in 8.0, and 8.6 makes this available for \ 
Solr also.

This can provide significant performance enhancements by not calculating the \ 
score for results which are not likely to appear in the top set of results.

It is enabled when using a new query parameter minExactCount. This parameter \ 
tells Solr to count the number of hits accurately until at least this \ 
value. Once this value is reached, Solr can skip over documents that don’t \ 
have a score high enough to be in the top set of documents, which has the \ 
potential for greatly speeding up searches.

It’s important to note that when using this parameter, the hit count of \ 
searches may not be accurate. It is guaranteed that the hit count is accurate up \ 
to the value of minExactCount, but any returned hit count higher than that may \ 
be an approximation.

A new boolean attribute numFoundExact is included in all responses to indicate \ 
if the hit count in the response is expected to be exact or not.

More information about this new feature is available in the section \ 
minExactCount Parameter.
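A client that displays hit counts may want to take numFoundExact into account. The sketch below shows one way to do that. The query URL is hypothetical and the response is a hand-written sample rather than real Solr output. Treating a missing numFoundExact as "exact" is a design choice made here, not something the notes specify.

```python
import json
from urllib.parse import urlencode

# Hypothetical query asking for an exact count only up to 100 hits.
query_url = "http://localhost:8983/solr/mycollection/select?" + urlencode({
    "q": "title:solr",
    "minExactCount": 100,
})

# Hand-written sample response, for illustration only.
sample_response = json.loads(
    '{"response": {"numFound": 153, "numFoundExact": false,'
    ' "start": 0, "docs": []}}'
)


def describe_hits(resp):
    """Return a display string that reflects whether the count is exact."""
    result = resp["response"]
    count = result["numFound"]
    # Assumption: older servers omit the flag, so default to "exact".
    if result.get("numFoundExact", True):
        return f"{count} hits"
    return f"approximately {count} hits"


print(query_url)
print(describe_hits(sample_response))
```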

Autoscaling

    NOTE: The default autoscaling policy has been removed as of 8.6.1

    Solr now includes a default autoscaling policy. This policy can be \ 
overridden with your custom rules or by specifying an empty policy to replace \ 
the default.

    The ComputePlan action now supports a collection selector to identify \ 
collections based on collection properties to determine which collections should \ 
be operated on.

Security

    Prior to Solr 8.6 Solr APIs which take a file system location, such as core \ 
creation, backup, restore, and others, did not validate the path and Solr would \ 
allow any absolute or relative path. Starting in 8.6 only paths that are \ 
relative to SOLR_HOME, SOLR_DATA_HOME and coreRootDir are allowed by default.

    If you need to create a core or store a backup outside the default paths, \ 
you will need to tell Solr which paths to allow. A new element in solr.xml \ 
called allowPaths takes a comma-separated list of allowed paths.

    When using the solr.xml file that ships with 8.6, you can configure the list \ 
of paths to allow through the system property solr.allowPaths. Please see \ 
bin/solr.in.sh or bin\solr.in.cmd for example usage. Using the value * will \ 
allow any path as in earlier versions.

    For more on this, see the section Solr.xml Parameters.

    Windows SMB shares in the UNC format, such as \\myhost\myshare\mypath, are \ 
now always disallowed. Please use drive letter mounts instead, e.g., S:\mypath.

    A new authorization plugin ExternalRoleRuleBasedAuthorizationPlugin is now \ 
available. This plugin allows an authentication plugin (such as JWT) to supply a \ 
user’s roles instead of maintaining a user-to-role mapping inside Solr.

    When authentication is enabled, the Admin UI Dashboard (main screen) now \ 
includes a panel that shows the authentication and authorization plugins in use, \ 
the logged in username, and the roles assigned to this user. A new link will \ 
also appear in the left-hand navigation to allow a user to log out.

Streaming Expressions

    The /export handler now supports streaming expressions to allow limiting the \ 
output of the export to only matching documents.

    For more information about how to use this, see the section Specifying the \ 
Local Streaming Expression.

    The stats, facet, and timeseries expressions now support percentiles and \ 
standard deviation aggregations.

Highlighting

For the Unified Highlighter: The setting hl.fragsizeIsMinimum now defaults to \ 
false because true was found to be a significant performance regression when \ 
highlighting lots of text. This will yield longer highlights on average compared \ 
to Solr 8.5 but relatively unchanged compared to previous releases. Furthermore, \ 
if your application highlights lots of text, you may want to experiment with \ 
lowering hl.fragAlignRatio to trade ideal fragment alignment for better \ 
performance.

Deprecations

A primary focus of the community is improving Solr’s stability and \ 
supportability. With the addition of the package manager system in 8.4, we now \ 
have the ability to move some features into plugins maintained by third-parties \ 
with the expertise to properly develop and support them. Our goal is to make \ 
running Solr easier and less prone to outages and other headaches. In this \ 
spirit, the following features have been deprecated and are slated to be removed \ 
in Solr 9.0.

    Cross Data Center Replication (CDCR), in its current form, is deprecated and \ 
is scheduled to be removed in 9.0. This feature is unlikely to be replaced by an \ 
identical plugin. However, the community is working on figuring out a \ 
replacement feature for disaster recovery and failover.

    The Data Import Handler (DIH) is deprecated and is scheduled to be removed \ 
in 9.0. Work to replace DIH with a community-supported plugin is underway and \ 
may be available soon.

    Support to store indexes and backups in HDFS is deprecated and is scheduled \ 
to be removed in 9.0. A community-supported version of this may be available as \ 
a plugin in the future. For more details, please see SOLR-14021.

Users interested in maintaining a feature as a plugin are encouraged to join the \ 
developer mailing list to find out more about how to help.

Files:
Revision  Action  File
1.4       modify  pkgsrc/databases/apache-solr/Makefile
1.3       modify  pkgsrc/databases/apache-solr/PLIST
1.5       modify  pkgsrc/databases/apache-solr/distinfo
1.2       modify  pkgsrc/databases/apache-solr/patches/patch-solr-bin