pkgsrc.se | The NetBSD package collection

Subject: CVS commit: pkgsrc/databases/redis
From: Adam Ciarcinski
Date: 2020-05-28 14:02:44
Message id: 20200528120244.CBDE8FB27@cvs.NetBSD.org
Log Message:
redis: updated to 6.0.4

Redis 6.0.4
===========

Upgrade urgency CRITICAL: this release fixes a severe replication bug.

Redis 6.0.4 fixes a critical replication bug caused by a new feature introduced
in Redis 6. The feature, called "meaningful offset" and strongly wanted by
myself (antirez) was an improvement that avoided that masters were no longer
able, during a failover where they were demoted to replicas, to partially
synchronize with the new master. In short the feature was able to avoid full
synchronizations with RDB. How did it work? By trimming the replication backlog
of the final "PING" commands the master was sending in the replication \ 
channel:
this way the replication offset would no longer go "after" the one of the
promoted replica, allowing the master to just continue in the same replication
history, receiving only a small data difference.

However after the introduction of the feature we (the Redis core team) quickly
understood there was something wrong: the apparently harmless feature had
many bugs, and the last bug we discovered, after a joined effort of multiple
people, we were not even able to fully understand after fixing it. Enough was
enough, we decided that the complexity cost of this feature was too high.
So Redis 6.0.4 removes the feature entirely, and fixes the data corruption that
it was able to cause.

However there are two facts to take in mind.

Fact 1: Setups using chained replication, that means that certain replicas
are replicating from other replicas, up to Redis 6.0.3 can experience data
corruption. For chained replication we mean that:

    +--------+          +---------+         +-------------+
    | master |--------->| replica |-------->| sub-replica |
    +--------+          +---------+         +-------------+

People using chained replication SHOULD UPGRADE ASAP away from Redis 6.0.0,
6.0.1, 6.0.2 or 6.0.3 to Redis 6.0.4.

To be clear, people NOT using this setup, but having just replicas attached
directly to the master, SHOUDL NOT BE in danger of any problem. But we
are no longer confident on 6.0.x replication implementation complexities
so we suggest to upgrade to 6.0.4 to everybody using an older 6.0.3 release.
We just so far didn't find any bug that affects Redis 6.0.3 that does not
involve chained replication.

People starting with Redis 6.0.4 are fine. People with Redis 5 are fine.
People upgrading from Redis 5 to Redis 6.0.4 are fine.
TLDR: The problem is with users of 6.0.0, 6.0.1, 6.0.2, 6.0.3.

Fact 2: Upgrading from Redis 6.0.x to Redis 6.0.4, IF AND ONLY IF you
use chained replication, requires some extra care:

1. Once you attach your new Redis 6.0.4 instance as a replica of the current
   Redis 6.0.x master, you should wait for the first full synchronization,
   then you should promote it right away, if your setup involves chained
   replication. Don't give it the time to do a new partial synchronization
   in the case the link between the master and the replica  will break in
   the mean time.

2. As an additional care, you may want to set the replication ping period
   to a very large value (for instance 1000000) using the following command:

       CONFIG SET repl-ping-replica-period 1000000

   Note that if you do "1" with care, "2" is not needed.
   However if you do it, make sure to later restore it to its default:

       CONFIG SET repl-ping-replica-period 10

So this is the main change in Redis 6. Later we'll find a different way in
order to achieve what we wanted to achieve with the Meaningful Offset feature,
but without the same complexity.

Other changes in this release:

* PSYNC2 tests improved.
* Fix a rare active defrag edge case bug leading to stagnation
* Fix Redis 6 asserting at startup in 32 bit systems.
* Redis 6 32 bit is now added back to our testing environments.
* Fix server crash for STRALGO command,
* Implement sendfile for RDB transfer.
* TLS fixes.
* Make replication more resistant by disconnecting the master if we
  detect a protocol error. Basically we no longer accept inline protocol
  from the master.
* Other improvements in the tests.
Files:
Revision	Action	file
1.53	modify	pkgsrc/databases/redis/Makefile
1.50	modify	pkgsrc/databases/redis/distinfo