Subject: CVS commit: pkgsrc/devel/kafka
From: Filip Hajny
Date: 2018-04-05 10:46:37
Message id: 20180405084637.688A2FBEC@cvs.NetBSD.org

Log Message:
devel/kafka: Update to 1.1.0.

New Feature

- automatic migration of log dirs to new locations
- KIP-145 - Expose Record Headers in Kafka Connect
- Add the AdminClient in Streams' KafkaClientSupplier
- Support dynamic updates of frequently updated broker configs

Improvement

- KafkaConnect should support regular expression for topics
- Move kafka-streams test fixtures into a published package
- SSL support for Connect REST API
- Grow default heap settings for distributed Connect from 256M to 1G
- Enable access to key in ValueTransformer
- Add "getAllKeys" API for querying windowed KTable stores
- Unify StreamsKafkaClient instances
- Revisit Streams DSL JavaDocs
- Extend Consumer Group Reset Offset tool for Stream Applications
- KIP-175: ConsumerGroupCommand no longer shows output for consumer
  groups which have not committed offsets
- Add a broker metric specifying the number of consumer group
  rebalances in progress
- Use Jackson for serialising to JSON
- KafkaShortnamer should allow for case-insensitive matches
- Improve Util classes
- Gradle 3.0+ is needed on the build
- Adding records deletion operation to the new Admin Client API
- Kafka metrics templates used in document generation should maintain
  order of tags
- Provide for custom error handling when Kafka Streams fails to
  produce
- Make Repartition Topics Transient
- Connect Schema comparison is slow for large schemas
- Add a Validator for NonNull configurations and remove redundant null
  checks on lists
- Have State Stores Restore Before Initializing Toplogy
- Optimize condition in if statement to reduce the number of
  comparisons
- Removed unnecessary null check
- Introduce Incremental FetchRequests to Increase Partition
  Scalability
- SSLTransportLayer should keep reading from socket until either the
  buffer is full or the socket has no more data
- Improve KTable Source state store auto-generated names
- Extend consumer offset reset tool to support deletion (KIP-229)
- Expose Kafka cluster ID in Connect REST API
- Maven artifact for kafka should not depend on log4j
- ConsumerGroupCommand should use the new consumer to query the log
  end offsets.
- Change LogSegment.delete to deleteIfExists and harden log recovery
- Make ProducerConfig and ConsumerConfig constructors public
- Improve synchronization in CachingKeyValueStore methods
- Improve Kafka GZip compression performance
- Improve JavaDoc of SourceTask#poll() to discourage indefinite
  blocking
- Avoid creating dummy checkpoint files with no state stores
- Change log level from ERROR to WARN for not leader for this
  partition exception
- Delay initiating the txn on producers until initializeTopology with
  EOS turned on

Bug

- change log4j to slf4j
- Use RollingFileAppender by default in log4j.properties
- Cached zkVersion not equal to that in zookeeper, broker not
  recovering.
- FileRecords.read doesn't handle size > sizeInBytes when start is not
  zero
- a soft failure in controller may leave a topic partition in an
  inconsistent state
- Cannot truncate to a negative offset (-1) exception at broker
  startup
- automated leader rebalance causes replication downtime for clusters
  with too many partitions
- kafka-run-class has potential to add a leading colon to classpath
- QueryableStateIntegrationTest.concurrentAccess is failing
  occasionally in jenkins builds
- FileStreamSource Connector not working for large files (~ 1GB)
- KeyValueIterator returns null values
- KafkaProducer is not joining its IO thread properly
- Kafka connect: error with special characters in connector name
- Replace StreamsKafkaClient with AdminClient in Kafka Streams
- LogCleaner#cleanSegments should not ignore failures to delete files
- Connect Rest API allows creating connectors with an empty name -
  KIP-212
- KerberosLogin#login should probably be synchronized
- Support replicas movement between log directories (KIP-113)
- Consumer ListOffsets request can starve group heartbeats
- Struct.put() should include the field name if validation fails
- Clarify handling of connector name in config
- Allow user to specify relative path as log directory
- KafkaConsumer should validate topics/TopicPartitions on
  subscribe/assign
- Controller should only update reassignment znode if there is change
  in the reassignment data
- records.lag should use tags for topic and partition rather than
  using metric name.
- KafkaProducer should not wrap InterruptedException in close() with
  KafkaException
- Connect classloader isolation may be broken for JDBC drivers
- JsonConverter generates "Mismatching schema" DataException
- NoSuchElementException in markErrorMeter during
  TransactionsBounceTest
- Make KafkaFuture.Function java 8 lambda compatible
- ThreadCache#sizeBytes() should check overflow
- KafkaFuture timeout fails to fire if a narrow race condition is hit
- DeleteRecordsRequest to a non-leader
- ReplicaFetcherThread should close the ReplicaFetcherBlockingSend
  earlier on shutdown
- NoSuchMethodError when creating ProducerRecord in upgrade system
  tests
- Running tools on Windows fail due to typo in JVM config
- Streams metrics tagged incorrectly
- ducker-ak: add ipaddress and enum34 dependencies to docker image
- Kafka cannot recover after an unclean shutdown on Windows
- Scanning plugin.path needs to support relative symlinks
- Reconnecting to broker does not exponentially backoff
- TaskManager should be type aware
- Major performance issue due to excessive logging during leader
  election
- RecordCollectorImpl should not retry sending
- Restore and global consumer should not use auto.offset.reset
- Global Consumer should handle TimeoutException
- Reduce rebalance time by not checking if created topics are
  available
- VerifiableConsumer with --max-messages doesn't exit
- Transaction markers are sometimes discarded if txns complete
  concurrently
- Simplify StreamsBuilder#addGlobalStore
- JmxReporter can't handle windows style directory paths
- CONSUMER-ID and HOST values are concatenated if the CONSUMER-ID is >
  50 chars
- ClientQuotaManager threads prevent shutdown when encountering an
  error loading logs
- Streams configuration requires consumer. and producer. in order to
  be read
- Timestamp on streams directory contains a colon, which is an illegal
  character
- Add methods in Options classes to keep binary compatibility with
  0.11
- RecordQueue.clear() does not clear MinTimestampTracker's maintained
  list
- Selector memory leak with high likelihood of OOM in case of down
  conversion
- GlobalKTable never finishes restoring when consuming transactional
  messages
- Server crash while deleting segments
- IllegalArgumentException if 1.0.0 is used for
  inter.broker.protocol.version or log.message.format.version
- Using standby replicas with an in memory state store causes Streams
  to crash
- Issues with protocol version when applying a rolling upgrade to
  1.0.0
- Fix system test dependency issues
- Kafka Connect requires permission to create internal topics even if
  they exist
- A metric named 'XX' already exists, can't register another one.
- Improve sink connector topic regex validation
- Flaky Unit test:
  KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegionWithNonZeroByteCache
- Make KafkaStreams.cleanup() clean global state directory
- AbstractCoordinator not clearly handles NULL Exception
- Request logging throws exception if acks=0
- GlobalKTable missing #queryableStoreName()
- KTable state restore fails after rebalance
- Make loadClass thread-safe for class loaders of Connect plugins
- System Test failed: ConnectRestApiTest
- Broken symlink interrupts scanning the plugin path
- NetworkClient should not return internal failed api version
  responses from poll
- Transient failure in
  NetworkClientTest.testConnectionDelayDisconnected
- Line numbers on log messages are incorrect
- Topic can not be recreated after it is deleted
- mBeanName should be removed before returning from
  JmxReporter#removeAttribute()
- Kafka Core should have explicit SLF4J API dependency
- StreamsResetter should return non-zero return code on error
- kafka-acls regression for comma characters (and maybe other
  characters as well)
- Error deleting log for topic, all log dirs failed.
- punctuate with WALL_CLOCK_TIME triggered immediately
- Exclude node groups belonging to global stores in
  InternalTopologyBuilder#makeNodeGroups
- Transient failure in
  \ 
kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.Admin \ 
ClientIntegrationTest.testAlterReplicaLogDirs
- NetworkClient.inFlightRequestCount() is not thread safe, causing
  ConcurrentModificationExceptions when sensors are read
- ConcurrentModificationException during streams state restoration
- Update KStream JavaDoc with regard to KIP-182
- RocksDB segments not removed when store is closed causes
  re-initialization to fail
- auto commit not work since coordinatorUnknown() is always true.
- Fix StateRestoreListener To Use Correct Batch Ending Offset
- NullPointerException on KStream-GlobalKTable leftJoin when
  KeyValueMapper returns null
- StreamThread.shutdown doesn't clean up completely when called before
  StreamThread.start
- Update ZooKeeper to 3.4.11, Gradle and other minor updates
- output from ensure copartitioning is not used for Cluster metadata,
  resulting in partitions without tasks working on them
- Consumer should not block setting initial positions of unavailable
  partitions
- Non-aggregation KTable generation operator does not construct value
  getter correctly
- AdminClient should handle empty or null topic names better
- When enable trace level log in mirror maker, it will throw null
  pointer exception and the mirror maker will shutdown
- Simplify KStreamReduce
- Base64URL encoding under JRE 1.7 is broken due to incorrect padding
  assumption
- ChangeLoggingKeyValueBytesStore.all() returns null
- Fetcher.retrieveOffsetsByTimes() should add all the topics to the
  metadata refresh topics set.
- LogSemgent.truncateTo() should always resize the index file
- Connect: Plugin scan is very slow
- Connect: Some per-task-metrics not working
- Connect header parser incorrectly parses arrays
- Java Producer: Excessive memory usage with compression enabled
- New Connect header support doesn't define 'converter.type' property
  correctly
- ZooKeeperClient holds a lock while waiting for responses, blocking
  shutdown
- Transient failure in
  DynamicBrokerReconfigurationTest.testThreadPoolResize
- Broker leaks memory and file descriptors after sudden client
  disconnects
- Delegation token internals should not impact public interfaces
- Streams quickstart pom.xml is missing versions for a bunch of
  plugins
- Deadlock while processing Controller Events
- Broker doesn't reject Produce request with inconsistent state
- LogCleanerManager.doneDeleting() should check the partition state
  before deleting the in progress partition
- KafkaController.brokerInfo not updated on dynamic update
- Connect standalone SASL file source and sink test fails without
  explanation
- Connect distributed and standalone worker 'main()' methods should
  catch and log all exceptions
- Consumer bytes-fetched and records-fetched metrics are not
  aggregated correctly
- Coordinator disconnect in heartbeat thread can cause commitSync to
  block indefinitely
- Regression in consumer auto-commit backoff behavior
- GroupMetadataManager.loadGroupsAndOffsets decompresses record batch
  needlessly
- log segment deletion could cause a disk to be marked offline
  incorrectly
- Delayed operations may not be completed when there is lock
  contention
- Expression for GlobalKTable is not correct
- System tests do not handle ZK chroot properly with SCRAM
- Fix config initialization in DynamicBrokerConfig
- ReplicaFetcher crashes with "Attempted to complete a transaction
  which was not started"

Test

- Add concurrent tests to exercise all paths in group/transaction
  managers
- Add unit tests for ClusterConnectionStates
- Only delete reassign_partitions znode after reassignment is complete
- KafkaStreamsTest fails in trunk
- SelectorTest may fail with ConcurrentModificationException

Sub-task

- Add capability to create delegation token
- Add authentication based on delegation token.
- Add capability to renew/expire delegation tokens.
- always leave the last surviving member of the ISR in ZK
- handle ZK session expiration properly when a new session can't be
  established
- Streams should not re-throw if suspending/closing tasks fails
- Use async ZookeeperClient in Controller
- Use async ZookeeperClient in SimpleAclAuthorizer
- Use async ZookeeperClient for DynamicConfigManager
- Use async ZookeeperClient for Admin operations
- Trogdor should handle injecting disk faults
- Add process stop faults, round trip workload, partitioned
  produce-consume test
- add the notion of max inflight requests to async ZookeeperClient
- Add workload generation capabilities to Trogdor
- Add ZooKeeperRequestLatencyMs to KafkaZkClient
- Use ZookeeperClient in LogManager
- Use ZookeeperClient in GroupCoordinator and TransactionCoordinator
- Use ZookeeperClient in KafkaApis
- Use ZookeeperClient in ReplicaManager and Partition
- Tests for KafkaZkClient
- Transient failure in
  \ 
kafka.api.SaslScramSslEndToEndAuthorizationTest.testTwoConsumersWithDifferentSas \ 
lCredentials
- minimize the number of triggers enqueuing
  PreferredReplicaLeaderElection events
- Enable dynamic reconfiguration of SSL keystores
- Enable resizing various broker thread pools
- Enable reconfiguration of metrics reporters and their custom configs
- Enable dynamic reconfiguration of log cleaners
- Enable reconfiguration of default topic configs used by brokers
- Enable reconfiguration of listeners and security configs
- Add ProduceBench to Trogdor
- move ZK metrics in KafkaHealthCheck to ZookeeperClient
- Add documentation for delegation token authentication mechanism
- Document dynamic config update
- Extend ConfigCommand to update broker config using new AdminClient
- Add test to verify markPartitionsForTruncation after fetcher thread
  pool resize

Files:
RevisionActionfile
1.6modifypkgsrc/devel/kafka/Makefile
1.5modifypkgsrc/devel/kafka/PLIST
1.6modifypkgsrc/devel/kafka/distinfo
1.2modifypkgsrc/devel/kafka/patches/patch-bin_connect-distributed.sh
1.2modifypkgsrc/devel/kafka/patches/patch-bin_connect-standalone.sh
1.3modifypkgsrc/devel/kafka/patches/patch-bin_kafka-run-class.sh
1.2modifypkgsrc/devel/kafka/patches/patch-bin_kafka-server-stop.sh
1.2modifypkgsrc/devel/kafka/patches/patch-config_server.properties