# Kafka Overview

⚡️ Tags: 📍Tools 📍Overview 📍Kafka

Kafka 101

# Why is Kafka fast?

Why is Kafka fast?

# Workflow

Kafka in a nutshell

# Architecture

Architecture

# Use cases

Kafka use cases

  • Log analysis - Log Aggregation
  • Data Streaming
  • System Monitoring & Alerting
  • Change Data Capture (CDC) - Data replication
  • System Migration - Message queuing

# Event sourcing

Kafka & event sourcing

# Message loss?

Message loss

# Producer

When we call producer.send() to send a message, it doesn't get sent to the broker directly. There are two threads and a queue involved in the message-sending process:

  1. Application thread: calls `producer.send()` and hands the record to the accumulator
  2. Record accumulator: the in-memory queue that batches records per partition
  3. Sender thread (I/O thread): drains the batches and sends them to the broker

We need to configure `acks` and `retries` properly on the producer to make sure messages are actually delivered to the broker.
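A minimal sketch of a producer configured along these lines (the broker address `localhost:9092` and the topic name `events` are placeholder assumptions): `send()` only appends the record to the accumulator, the sender thread ships the batches, and `acks=all` plus retries with idempotence mean an acknowledged record has actually reached the in-sync replicas.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ReliableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Durability: wait for all in-sync replicas and retry transient failures.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.toString(Integer.MAX_VALUE));
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true"); // avoid duplicates on retry

        // Batching: how long/large the record accumulator buffers before the sender thread ships a batch.
        props.put(ProducerConfig.LINGER_MS_CONFIG, "5");
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "32768");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // send() only appends to the record accumulator; the callback fires
            // once the sender (I/O) thread gets a response from the broker.
            producer.send(new ProducerRecord<>("events", "key-1", "hello"), (metadata, exception) -> {
                if (exception != null) {
                    // Non-retriable error or retries exhausted: handle/log it, do not silently drop the record.
                    exception.printStackTrace();
                } else {
                    System.out.printf("acked partition=%d offset=%d%n", metadata.partition(), metadata.offset());
                }
            });
            producer.flush();
        }
    }
}
```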

# Broker

A broker cluster should not lose messages when it is functioning normally. However, we need to understand which extreme situations might lead to message loss:

  1. Messages are usually flushed to disk asynchronously for higher I/O throughput, so if a broker goes down before the flush happens, its unflushed messages are lost.
  2. The replicas in the Kafka cluster need to be properly configured to hold a valid copy of the data: the replication factor, `min.insync.replicas`, and leader-election settings determine whether an acknowledged message can survive such a failure.
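As a sketch of what "properly configured replicas" can look like (broker address and topic name are placeholders), the topic below uses a replication factor of 3, `min.insync.replicas=2`, and disabled unclean leader election; combined with `acks=all` on the producer, a write is only acknowledged once at least two replicas have it, so losing a single broker before the disk flush should not lose acknowledged data.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class DurableTopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, 3 replicas per partition.
            NewTopic topic = new NewTopic("events", 3, (short) 3)
                    .configs(Map.of(
                            // Require at least 2 in-sync replicas to acknowledge an acks=all write.
                            "min.insync.replicas", "2",
                            // Never elect an out-of-sync replica as leader (avoids silent data loss).
                            "unclean.leader.election.enable", "false"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```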

# Consumer

Kafka offers different ways to commit offsets. Auto-committing might commit offsets for records before they are actually processed, so if the consumer goes down in the middle of processing, some records may never be processed.

A good practice is to combine synchronous and asynchronous commits: use asynchronous commits in the processing loop for higher throughput, and a synchronous commit on shutdown or in exception handling to make sure the last offset is always committed.
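A minimal sketch of that pattern (broker address, group id, and topic name are placeholders, and `process()` stands in for the real processing step): auto-commit is disabled, `commitAsync()` runs after each processed batch, and `commitSync()` runs on the way out so the last processed offset is committed even when the loop exits abnormally.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AtLeastOnceConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "events-processor");         // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Disable auto-commit so offsets are only committed after records are processed.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singleton("events"));
        try {
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // hypothetical processing step
                }
                // Fast, non-blocking commit on the happy path.
                consumer.commitAsync();
            }
        } finally {
            try {
                // Blocking commit on shutdown/error so the last processed offset is not lost.
                consumer.commitSync();
            } finally {
                consumer.close();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```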

# Kafka vs RabbitMQ

Kafka vs RabbitMQ