Articles

The Log: What every software engineer should know about real-time data’s unifying abstraction | LinkedIn Engineering; the origins.

Gently Down the Stream: a gentle, illustrated introduction to Apache Kafka.

Hosted Kafka: Upstash: Serverless Data for Redis® and Kafka®.

Why Apache Kafka doesn’t need fsync to be safe — Jack Vanlightly; Apr 2023.

Kafka vs Redpanda Performance - Do the claims add up? By Jack Vanlightly; May 2023. Seven-part series.

Kafka is dead, long live Kafka - WarpStream. WarpStream describes itself as an Apache Kafka® protocol-compatible data streaming platform built directly on top of S3. It is delivered as a single, stateless Go binary, so there are no local disks to manage, no brokers to rebalance, and no ZooKeeper to operate. WarpStream claims to be 5-10x cheaper than Kafka in the cloud because data streams directly to and from S3 instead of using inter-zone networking, which it says can be over 80% of the infrastructure cost of a Kafka deployment at scale.

For an “open source WarpStream”, see tisonkun/morax: message queue and data streaming based on cloud native services.

Use Cases and Architectures for Kafka at the Edge - Kai Waehner — “Single Broker Kafka”.

Kafka Zero Copy

Deep dive into Apache Kafka storage internals: segments, rolling and retention; 2021.

Libraries

Apache Beam — unified programming model for batch and streaming data processing pipelines that simplifies large-scale data processing dynamics.

Jikkou: an open-source framework that enables self-serve resource provisioning. It lets developers and DevOps teams manage, automate, and provision the resources needed for their Apache Kafka platform. The desired state of each managed resource is described in YAML descriptor files.

Tools


  • Kafka Connect

  • Kafka Streams

  • Schema Registry

  • REST Proxy

  • Cruise Control: continuously monitors the cluster’s metrics and automatically reacts to them to alleviate problems.

  • ksqlDB - streaming SQL over Kafka.

  • Strimzi operator - sets up Kafka and associated components on Kubernetes.

  • kafka-ui: view schemas and connectors, and manage multiple Kafka clusters.

  • Kroxylicious | Network proxy framework for Apache Kafka. “Topic encryption, policy-enforcement, multi-tenancy, audit and much more.”

  • kadeck | Kafka UI

Kafka-like

Iggy.rs “is the persistent message streaming platform written in Rust, supporting QUIC, TCP (custom binary specification) and HTTP (regular REST API) transport protocols. Currently, running as a single server, it allows creating streams, topics, partitions and segments, and send/receive messages to/from them. The messages are stored on disk as an append-only log, and are persisted between restarts. The goal of the project is to make a distributed streaming platform (running as a cluster), which will be able to scale horizontally and handle millions of messages per second.”
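The "append-only log, persisted between restarts" idea in the quote above can be illustrated with a rough Python sketch. This is a toy, not Iggy's or Kafka's actual on-disk format: the `AppendOnlyLog` class, the 4-byte length-prefix framing, and the `topic-0.log` file name are all assumptions made up for this example.

```python
import os
import struct
import tempfile

# Hypothetical framing for this sketch: 4-byte big-endian length, then the payload.
_HEADER = struct.Struct(">I")


class AppendOnlyLog:
    """Toy single-file append-only log (illustrative only)."""

    def __init__(self, path):
        self.path = path
        # Create the file if it is missing, without truncating existing data.
        open(path, "ab").close()
        # Rebuild the in-memory index (offset -> byte position) by scanning the file.
        self._positions = []
        with open(path, "rb") as f:
            pos = 0
            while True:
                header = f.read(_HEADER.size)
                if len(header) < _HEADER.size:
                    break
                (length,) = _HEADER.unpack(header)
                self._positions.append(pos)
                f.seek(length, os.SEEK_CUR)
                pos += _HEADER.size + length

    def append(self, payload: bytes) -> int:
        """Append one message and return its offset (0-based sequence number)."""
        pos = os.path.getsize(self.path)
        with open(self.path, "ab") as f:
            f.write(_HEADER.pack(len(payload)) + payload)
            f.flush()
            os.fsync(f.fileno())  # durability choice of this sketch only
        self._positions.append(pos)
        return len(self._positions) - 1

    def read(self, offset: int) -> bytes:
        """Return the payload stored at the given offset."""
        with open(self.path, "rb") as f:
            f.seek(self._positions[offset])
            (length,) = _HEADER.unpack(f.read(_HEADER.size))
            return f.read(length)

    def __len__(self):
        return len(self._positions)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "topic-0.log")
        log = AppendOnlyLog(path)
        first = log.append(b"hello")
        second = log.append(b"world")
        reopened = AppendOnlyLog(path)  # simulates a restart
        return first, second, len(reopened), reopened.read(1)
```

Existing records are never rewritten; a restart only rescans the file to rebuild the offset index. Real systems typically split the log into segments and keep a persistent index rather than rescanning, and they make different durability trade-offs than the per-append `fsync` used here.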