./devel/kafka, Distributed streaming platform

[ CVSweb ] [ Homepage ] [ RSS ] [ Required by ] [ Add to tracker ]


Branch: CURRENT, Version: 1.1.0, Package name: kafka-1.1.0, Maintainer: pkgsrc-users

Kafka is used for building real-time data pipelines and streaming
apps. It is horizontally scalable, fault-tolerant, wicked fast,
and runs in production in thousands of companies.


Required to run:
[devel/zookeeper] [lang/openjdk11]

Required to build:
[pkgtools/cwrappers]

Master sites: (Expand)

Filesize: 49146.691 KB

Version history: (Expand)


CVS history: (Expand)


   2021-10-26 12:20:11 by Nia Alarie | Files touched by this commit (3016)
Log message:
archivers: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

Could not be committed due to merge conflict:
devel/py-traitlets/distinfo

The following distfiles were unfetchable (note: some may be only fetched
conditionally):

./devel/pvs/distinfo pvs-3.2-solaris.tgz
./devel/eclipse/distinfo eclipse-sourceBuild-srcIncluded-3.0.1.zip
   2021-10-07 15:44:44 by Nia Alarie | Files touched by this commit (3017)
Log message:
devel: Remove SHA1 hashes for distfiles
   2020-05-27 21:37:44 by Thomas Klausner | Files touched by this commit (60)
Log message:
*: reset MAINTAINER for fhajny on his request
   2020-01-19 00:36:14 by Roland Illig | Files touched by this commit (3046)
Log message:
all: migrate several HOMEPAGEs to https

pkglint --only "https instead of http" -r -F

With manual adjustments afterwards since pkglint 19.4.4 fixed a few
indentations in unrelated lines.

This mainly affects projects hosted at SourceForce, as well as
freedesktop.org, CTAN and GNU.
   2019-11-03 11:39:32 by Roland Illig | Files touched by this commit (274)
Log message:
devel: align variable assignments

pkglint -Wall -F --only aligned --only indent -r

No manual corrections.
   2018-12-15 22:12:25 by Thomas Klausner | Files touched by this commit (67) | Package updated
Log message:
*: update email for fhajny
   2018-04-05 10:46:37 by Filip Hajny | Files touched by this commit (8) | Package updated
Log message:
devel/kafka: Update to 1.1.0.

New Feature

- automatic migration of log dirs to new locations
- KIP-145 - Expose Record Headers in Kafka Connect
- Add the AdminClient in Streams' KafkaClientSupplier
- Support dynamic updates of frequently updated broker configs

Improvement

- KafkaConnect should support regular expression for topics
- Move kafka-streams test fixtures into a published package
- SSL support for Connect REST API
- Grow default heap settings for distributed Connect from 256M to 1G
- Enable access to key in ValueTransformer
- Add "getAllKeys" API for querying windowed KTable stores
- Unify StreamsKafkaClient instances
- Revisit Streams DSL JavaDocs
- Extend Consumer Group Reset Offset tool for Stream Applications
- KIP-175: ConsumerGroupCommand no longer shows output for consumer
  groups which have not committed offsets
- Add a broker metric specifying the number of consumer group
  rebalances in progress
- Use Jackson for serialising to JSON
- KafkaShortnamer should allow for case-insensitive matches
- Improve Util classes
- Gradle 3.0+ is needed on the build
- Adding records deletion operation to the new Admin Client API
- Kafka metrics templates used in document generation should maintain
  order of tags
- Provide for custom error handling when Kafka Streams fails to
  produce
- Make Repartition Topics Transient
- Connect Schema comparison is slow for large schemas
- Add a Validator for NonNull configurations and remove redundant null
  checks on lists
- Have State Stores Restore Before Initializing Toplogy
- Optimize condition in if statement to reduce the number of
  comparisons
- Removed unnecessary null check
- Introduce Incremental FetchRequests to Increase Partition
  Scalability
- SSLTransportLayer should keep reading from socket until either the
  buffer is full or the socket has no more data
- Improve KTable Source state store auto-generated names
- Extend consumer offset reset tool to support deletion (KIP-229)
- Expose Kafka cluster ID in Connect REST API
- Maven artifact for kafka should not depend on log4j
- ConsumerGroupCommand should use the new consumer to query the log
  end offsets.
- Change LogSegment.delete to deleteIfExists and harden log recovery
- Make ProducerConfig and ConsumerConfig constructors public
- Improve synchronization in CachingKeyValueStore methods
- Improve Kafka GZip compression performance
- Improve JavaDoc of SourceTask#poll() to discourage indefinite
  blocking
- Avoid creating dummy checkpoint files with no state stores
- Change log level from ERROR to WARN for not leader for this
  partition exception
- Delay initiating the txn on producers until initializeTopology with
  EOS turned on

Bug

- change log4j to slf4j
- Use RollingFileAppender by default in log4j.properties
- Cached zkVersion not equal to that in zookeeper, broker not
  recovering.
- FileRecords.read doesn't handle size > sizeInBytes when start is not
  zero
- a soft failure in controller may leave a topic partition in an
  inconsistent state
- Cannot truncate to a negative offset (-1) exception at broker
  startup
- automated leader rebalance causes replication downtime for clusters
  with too many partitions
- kafka-run-class has potential to add a leading colon to classpath
- QueryableStateIntegrationTest.concurrentAccess is failing
  occasionally in jenkins builds
- FileStreamSource Connector not working for large files (~ 1GB)
- KeyValueIterator returns null values
- KafkaProducer is not joining its IO thread properly
- Kafka connect: error with special characters in connector name
- Replace StreamsKafkaClient with AdminClient in Kafka Streams
- LogCleaner#cleanSegments should not ignore failures to delete files
- Connect Rest API allows creating connectors with an empty name -
  KIP-212
- KerberosLogin#login should probably be synchronized
- Support replicas movement between log directories (KIP-113)
- Consumer ListOffsets request can starve group heartbeats
- Struct.put() should include the field name if validation fails
- Clarify handling of connector name in config
- Allow user to specify relative path as log directory
- KafkaConsumer should validate topics/TopicPartitions on
  subscribe/assign
- Controller should only update reassignment znode if there is change
  in the reassignment data
- records.lag should use tags for topic and partition rather than
  using metric name.
- KafkaProducer should not wrap InterruptedException in close() with
  KafkaException
- Connect classloader isolation may be broken for JDBC drivers
- JsonConverter generates "Mismatching schema" DataException
- NoSuchElementException in markErrorMeter during
  TransactionsBounceTest
- Make KafkaFuture.Function java 8 lambda compatible
- ThreadCache#sizeBytes() should check overflow
- KafkaFuture timeout fails to fire if a narrow race condition is hit
- DeleteRecordsRequest to a non-leader
- ReplicaFetcherThread should close the ReplicaFetcherBlockingSend
  earlier on shutdown
- NoSuchMethodError when creating ProducerRecord in upgrade system
  tests
- Running tools on Windows fail due to typo in JVM config
- Streams metrics tagged incorrectly
- ducker-ak: add ipaddress and enum34 dependencies to docker image
- Kafka cannot recover after an unclean shutdown on Windows
- Scanning plugin.path needs to support relative symlinks
- Reconnecting to broker does not exponentially backoff
- TaskManager should be type aware
- Major performance issue due to excessive logging during leader
  election
- RecordCollectorImpl should not retry sending
- Restore and global consumer should not use auto.offset.reset
- Global Consumer should handle TimeoutException
- Reduce rebalance time by not checking if created topics are
  available
- VerifiableConsumer with --max-messages doesn't exit
- Transaction markers are sometimes discarded if txns complete
  concurrently
- Simplify StreamsBuilder#addGlobalStore
- JmxReporter can't handle windows style directory paths
- CONSUMER-ID and HOST values are concatenated if the CONSUMER-ID is >
  50 chars
- ClientQuotaManager threads prevent shutdown when encountering an
  error loading logs
- Streams configuration requires consumer. and producer. in order to
  be read
- Timestamp on streams directory contains a colon, which is an illegal
  character
- Add methods in Options classes to keep binary compatibility with
  0.11
- RecordQueue.clear() does not clear MinTimestampTracker's maintained
  list
- Selector memory leak with high likelihood of OOM in case of down
  conversion
- GlobalKTable never finishes restoring when consuming transactional
  messages
- Server crash while deleting segments
- IllegalArgumentException if 1.0.0 is used for
  inter.broker.protocol.version or log.message.format.version
- Using standby replicas with an in memory state store causes Streams
  to crash
- Issues with protocol version when applying a rolling upgrade to
  1.0.0
- Fix system test dependency issues
- Kafka Connect requires permission to create internal topics even if
  they exist
- A metric named 'XX' already exists, can't register another one.
- Improve sink connector topic regex validation
- Flaky Unit test:
  KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegionWithNonZeroByteCache
- Make KafkaStreams.cleanup() clean global state directory
- AbstractCoordinator not clearly handles NULL Exception
- Request logging throws exception if acks=0
- GlobalKTable missing #queryableStoreName()
- KTable state restore fails after rebalance
- Make loadClass thread-safe for class loaders of Connect plugins
- System Test failed: ConnectRestApiTest
- Broken symlink interrupts scanning the plugin path
- NetworkClient should not return internal failed api version
  responses from poll
- Transient failure in
  NetworkClientTest.testConnectionDelayDisconnected
- Line numbers on log messages are incorrect
- Topic can not be recreated after it is deleted
- mBeanName should be removed before returning from
  JmxReporter#removeAttribute()
- Kafka Core should have explicit SLF4J API dependency
- StreamsResetter should return non-zero return code on error
- kafka-acls regression for comma characters (and maybe other
  characters as well)
- Error deleting log for topic, all log dirs failed.
- punctuate with WALL_CLOCK_TIME triggered immediately
- Exclude node groups belonging to global stores in
  InternalTopologyBuilder#makeNodeGroups
- Transient failure in
  \ 
kafka.api.AdminClientIntegrationTest.testLogStartOffsetCheckpointkafka.api.Admin \ 
ClientIntegrationTest.testAlterReplicaLogDirs
- NetworkClient.inFlightRequestCount() is not thread safe, causing
  ConcurrentModificationExceptions when sensors are read
- ConcurrentModificationException during streams state restoration
- Update KStream JavaDoc with regard to KIP-182
- RocksDB segments not removed when store is closed causes
  re-initialization to fail
- auto commit not work since coordinatorUnknown() is always true.
- Fix StateRestoreListener To Use Correct Batch Ending Offset
- NullPointerException on KStream-GlobalKTable leftJoin when
  KeyValueMapper returns null
- StreamThread.shutdown doesn't clean up completely when called before
  StreamThread.start
- Update ZooKeeper to 3.4.11, Gradle and other minor updates
- output from ensure copartitioning is not used for Cluster metadata,
  resulting in partitions without tasks working on them
- Consumer should not block setting initial positions of unavailable
  partitions
- Non-aggregation KTable generation operator does not construct value
  getter correctly
- AdminClient should handle empty or null topic names better
- When enable trace level log in mirror maker, it will throw null
  pointer exception and the mirror maker will shutdown
- Simplify KStreamReduce
- Base64URL encoding under JRE 1.7 is broken due to incorrect padding
  assumption
- ChangeLoggingKeyValueBytesStore.all() returns null
- Fetcher.retrieveOffsetsByTimes() should add all the topics to the
  metadata refresh topics set.
- LogSemgent.truncateTo() should always resize the index file
- Connect: Plugin scan is very slow
- Connect: Some per-task-metrics not working
- Connect header parser incorrectly parses arrays
- Java Producer: Excessive memory usage with compression enabled
- New Connect header support doesn't define 'converter.type' property
  correctly
- ZooKeeperClient holds a lock while waiting for responses, blocking
  shutdown
- Transient failure in
  DynamicBrokerReconfigurationTest.testThreadPoolResize
- Broker leaks memory and file descriptors after sudden client
  disconnects
- Delegation token internals should not impact public interfaces
- Streams quickstart pom.xml is missing versions for a bunch of
  plugins
- Deadlock while processing Controller Events
- Broker doesn't reject Produce request with inconsistent state
- LogCleanerManager.doneDeleting() should check the partition state
  before deleting the in progress partition
- KafkaController.brokerInfo not updated on dynamic update
- Connect standalone SASL file source and sink test fails without
  explanation
- Connect distributed and standalone worker 'main()' methods should
  catch and log all exceptions
- Consumer bytes-fetched and records-fetched metrics are not
  aggregated correctly
- Coordinator disconnect in heartbeat thread can cause commitSync to
  block indefinitely
- Regression in consumer auto-commit backoff behavior
- GroupMetadataManager.loadGroupsAndOffsets decompresses record batch
  needlessly
- log segment deletion could cause a disk to be marked offline
  incorrectly
- Delayed operations may not be completed when there is lock
  contention
- Expression for GlobalKTable is not correct
- System tests do not handle ZK chroot properly with SCRAM
- Fix config initialization in DynamicBrokerConfig
- ReplicaFetcher crashes with "Attempted to complete a transaction
  which was not started"

Test

- Add concurrent tests to exercise all paths in group/transaction
  managers
- Add unit tests for ClusterConnectionStates
- Only delete reassign_partitions znode after reassignment is complete
- KafkaStreamsTest fails in trunk
- SelectorTest may fail with ConcurrentModificationException

Sub-task

- Add capability to create delegation token
- Add authentication based on delegation token.
- Add capability to renew/expire delegation tokens.
- always leave the last surviving member of the ISR in ZK
- handle ZK session expiration properly when a new session can't be
  established
- Streams should not re-throw if suspending/closing tasks fails
- Use async ZookeeperClient in Controller
- Use async ZookeeperClient in SimpleAclAuthorizer
- Use async ZookeeperClient for DynamicConfigManager
- Use async ZookeeperClient for Admin operations
- Trogdor should handle injecting disk faults
- Add process stop faults, round trip workload, partitioned
  produce-consume test
- add the notion of max inflight requests to async ZookeeperClient
- Add workload generation capabilities to Trogdor
- Add ZooKeeperRequestLatencyMs to KafkaZkClient
- Use ZookeeperClient in LogManager
- Use ZookeeperClient in GroupCoordinator and TransactionCoordinator
- Use ZookeeperClient in KafkaApis
- Use ZookeeperClient in ReplicaManager and Partition
- Tests for KafkaZkClient
- Transient failure in
  \ 
kafka.api.SaslScramSslEndToEndAuthorizationTest.testTwoConsumersWithDifferentSas \ 
lCredentials
- minimize the number of triggers enqueuing
  PreferredReplicaLeaderElection events
- Enable dynamic reconfiguration of SSL keystores
- Enable resizing various broker thread pools
- Enable reconfiguration of metrics reporters and their custom configs
- Enable dynamic reconfiguration of log cleaners
- Enable reconfiguration of default topic configs used by brokers
- Enable reconfiguration of listeners and security configs
- Add ProduceBench to Trogdor
- move ZK metrics in KafkaHealthCheck to ZookeeperClient
- Add documentation for delegation token authentication mechanism
- Document dynamic config update
- Extend ConfigCommand to update broker config using new AdminClient
- Add test to verify markPartitionsForTruncation after fetcher thread
  pool resize
   2018-03-07 12:50:57 by Filip Hajny | Files touched by this commit (3)
Log message:
devel/kafka: Update to 1.0.1.

Improvement
- Kafka metrics templates used in document generation should maintain
  order of tags
- Consolidate MockTime implementations between connect and clients
- Fix repeated words words in JavaDoc and comments.
- Cache lastEntry in TimeIndex to avoid unnecessary disk access
- AbstractIndex should cache index file to avoid unnecessary disk
  access during resize()
- Have State Stores Restore Before Initializing Toplogy
- SSLTransportLayer should keep reading from socket until either the
  buffer is full or the socket has no more data
- Improve KTable Source state store auto-generated names

Bug
- QueryableStateIntegrationTest.concurrentAccess is failing
  occasionally in jenkins builds
- KafkaProducer is not joining its IO thread properly
- Kafka connect: error with special characters in connector name
- Streams metrics tagged incorrectly
- ClassCastException in BigQuery connector
- ClientQuotaManager threads prevent shutdown when encountering an
  error loading logs
- Streams configuration requires consumer. and producer. in order to
  be read
- Timestamp on streams directory contains a colon, which is an illegal
  character
- Add methods in Options classes to keep binary compatibility with
  0.11
- RecordQueue.clear() does not clear MinTimestampTracker's maintained
  list
- Selector memory leak with high likelihood of OOM in case of down
  conversion
- GlobalKTable never finishes restoring when consuming transactional
  messages
- IllegalArgumentException if 1.0.0 is used for
  inter.broker.protocol.version or log.message.format.version
- Using standby replicas with an in memory state store causes Streams
  to crash
- Issues with protocol version when applying a rolling upgrade to
  1.0.0
- A metric named 'XX' already exists, can't register another one.
- Flaky Unit test:
  KStreamKTableJoinIntegrationTest.shouldCountClicksPerRegionWithNonZeroByteCache
- AbstractCoordinator not clearly handles NULL Exception
- Request logging throws exception if acks=0
- KTable state restore fails after rebalance
- Make loadClass thread-safe for class loaders of Connect plugins
- Inconsistent protocol type for empty consumer groups
- Broken symlink interrupts scanning the plugin path
- NetworkClient should not return internal failed api version
  responses from poll
- Topic can not be recreated after it is deleted
- mBeanName should be removed before returning from
  JmxReporter#removeAttribute()
- Connect: Struct equals/hashCode method should use Arrays#deep*
  methods
- kafka-acls regression for comma characters (and maybe other
  characters as well)
- punctuate with WALL_CLOCK_TIME triggered immediately
- Update KStream JavaDoc with regard to KIP-182
- StackOverflowError in kafka-coordinator-heartbeat-thread
- Fix StateRestoreListener To Use Correct Batch Ending Offset
- NullPointerException on KStream-GlobalKTable leftJoin when
  KeyValueMapper returns null
- Non-aggregation KTable generation operator does not construct value
  getter correctly
- When enable trace level log in mirror maker, it will throw null
  pointer exception and the mirror maker will shutdown
- Enforce layout of dependencies within a Connect plugin to be
  deterministic
- Connect: Some per-task-metrics not working
- Broker leaks memory and file descriptors after sudden client
  disconnects

Test
- KafkaStreamsTest fails in trunk
- SelectorTest may fail with ConcurrentModificationException