What is a Kafka Broker?


The Kafka broker can be defined as one of the core components of the Kafka architecture. It is also widely known as the Kafka server and a Kafka node.

The Kafka broker manages the storage of data records (messages) in topics. It can be understood as the mediator between producers and consumers. When more than one broker works together, they form a Kafka cluster. The broker receives the messages that publishers push into the Kafka commit log and serves them to the subscribers that consume them. By mediating this exchange between multiple systems, it ensures that each data record reaches the right consumer for processing.

A Kafka cluster consists of several brokers and uses ZooKeeper to maintain the state of the cluster. A single broker can handle thousands of read and write requests per second, and each broker can hold terabytes of messages without a performance impact. ZooKeeper is also used to elect the controller broker.

A Kafka broker runs as a KafkaServer that hosts topics. The partitions of those topics are spread across the brokers in a Kafka cluster. A single broker can host partitions of more than one topic, even when a topic has only a single partition. Kafka producers push messages to a broker, which receives each data record and stores it on disk, where it is identified by a unique offset within its partition. Consumers then fetch messages by topic, partition, and offset.
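The way a record is addressed by topic, partition, and offset can be sketched with a small in-memory model. This is only an illustration: `ToyBroker`, its methods, and the topic names are hypothetical and are not part of Kafka, which stores its logs on disk.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker's log storage (not Kafka's real code)."""

    def __init__(self):
        # (topic, partition) -> append-only list of records
        self._logs = defaultdict(list)

    def append(self, topic, partition, record):
        """Store a record and return the offset it was assigned."""
        log = self._logs[(topic, partition)]
        log.append(record)
        return len(log) - 1  # offsets start at 0 and grow by 1 per record

    def fetch(self, topic, partition, offset, max_records=10):
        """Return up to max_records records starting at the given offset."""
        log = self._logs.get((topic, partition), [])
        return log[offset:offset + max_records]


broker = ToyBroker()
first = broker.append("orders", 0, "order-created")
second = broker.append("orders", 0, "order-paid")
other = broker.append("payments", 0, "payment-received")

print(first, second, other)          # offsets are tracked per partition
print(broker.fetch("orders", 0, 1))  # read "orders" partition 0 from offset 1
```

Each partition keeps its own offset sequence, which is why a consumer needs all three coordinates to locate a message.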

Brokers form a Kafka cluster by sharing information with each other, either directly or indirectly through ZooKeeper. Exactly one broker in the cluster acts as the controller.

How to Start Kafka Broker?

A Kafka broker is started with the kafka-server-start.sh script.

Steps:

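  • Start ZooKeeper (paths are relative to the Kafka installation directory): bin/zookeeper-server-start.sh config/zookeeper.properties
  • Once ZooKeeper is up and running, start the Kafka server: bin/kafka-server-start.sh config/server.properties
  • The kafka-server-start.sh script starts the Kafka broker.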

Confirm that ZooKeeper is up and running before you run the kafka-server-start.sh script. ZooKeeper itself is started with the zookeeper-server-start.sh shell script.

The kafka-server-start.sh script uses config/log4j.properties for its logging configuration, which can be overridden through the KAFKA_LOG4J_OPTS environment variable.

Kafka Command-line Options

Some of the Kafka Command-line Options are mentioned below:

| Command-line Option | Description |
| --- | --- |
| --override property=value | Overrides the value set for the given property in the server properties file. |
| -daemon | Enables daemon mode. |
| -loggc | Enables garbage-collection logging. Automatically enabled when Kafka is in daemon mode. |
| -name | Name used for the server process. Defaults to kafkaServer when Kafka is in daemon mode. |
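As a rough sketch of how these options combine, the helper below assembles an argument list for kafka-server-start.sh without running it. The helper name `build_start_command`, the script path, and the example property values are assumptions for illustration; the ordering follows the script's usage line, where -daemon comes before the properties file and each --override follows it.

```python
def build_start_command(properties_file, daemon=False, overrides=None,
                        script="bin/kafka-server-start.sh"):
    """Assemble the argument list for kafka-server-start.sh (does not run it)."""
    command = [script]
    if daemon:
        command.append("-daemon")  # must come before the properties file
    command.append(properties_file)
    for key, value in (overrides or {}).items():
        # Each override is passed as its own "--override property=value" pair.
        command += ["--override", f"{key}={value}"]
    return command


command = build_start_command(
    "config/server.properties",
    daemon=True,
    overrides={"broker.id": 1, "log.dirs": "/tmp/kafka-logs-1"},
)
print(" ".join(command))
```

The resulting list could be handed to `subprocess.run` on a machine where Kafka is installed. Overrides are convenient when several brokers share one properties file and differ only in a few settings.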

What is Kafka Cluster?

As covered above, the Kafka broker acts as a mediator between systems, transferring messages from the source to the destination.

A group of brokers working together is called a Kafka cluster. A cluster can contain as few as one to three brokers or as many as hundreds. Organizations that deal with streaming data, such as Netflix, Hotstar+, Ola, and Uber, run thousands of Kafka brokers to manage and handle their data.

How do you identify a specific broker among all the brokers in a Kafka cluster? Each broker in a cluster is identified by its unique numeric ID.

[Figure: a Kafka cluster made up of five Kafka brokers]

A Kafka cluster consists of multiple components, commonly called nodes. Services such as Kafka brokers, Kafka consumers, Kafka producers, and ZooKeeper are deployed together to form a complete Kafka cluster. The cluster supports features such as failover, replication, high data availability, and multiple partitions.

With several brokers in a Kafka cluster, messages are distributed across multiple instances. ZooKeeper plays a crucial role in the cluster: it synchronizes and manages the distributed configuration and acts as the coordinator for the Kafka brokers and consumers.

Kafka producers push messages into the Kafka cluster, where the brokers store them until Kafka consumers read them.

The Kafka Cluster article covers the cluster architecture in more depth.

How do Clients Connect to a Kafka Cluster (Bootstrap Server)?

Let us explore how a client connects to a Kafka cluster. First, the client connects to any one Kafka broker in order to start sending or receiving data records. Every broker in the cluster holds metadata about all the other brokers, and the client uses this metadata to reach the brokers it needs. For this reason, any broker in the cluster can serve as a bootstrap server.

The bootstrap server returns the metadata to the client, including the list of all Kafka brokers in the cluster. With this information, the client knows which brokers it can connect to in order to send or receive data, and it can locate the specific brokers that serve the topic partitions it is interested in.
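The discovery step described above can be sketched as a small simulation. Everything here is hypothetical: the broker IDs, host names, `fetch_metadata`, and `broker_for` are stand-ins for illustration, not a Kafka client API, and no network traffic is involved.

```python
# Hypothetical cluster metadata, as any one broker might return it.
CLUSTER_METADATA = {
    # broker ID -> address
    "brokers": {
        101: "broker-101.example.com:9092",
        102: "broker-102.example.com:9092",
        103: "broker-103.example.com:9092",
    },
    # (topic, partition) -> ID of the broker that serves it
    "partitions": {
        ("orders", 0): 101,
        ("orders", 1): 102,
        ("payments", 0): 103,
    },
}


def fetch_metadata(bootstrap_address):
    """Stand-in for a metadata request; every broker returns the same view."""
    if bootstrap_address not in CLUSTER_METADATA["brokers"].values():
        raise ConnectionError(f"unknown broker {bootstrap_address}")
    return CLUSTER_METADATA


def broker_for(metadata, topic, partition):
    """Look up which broker serves a given topic partition."""
    broker_id = metadata["partitions"][(topic, partition)]
    return broker_id, metadata["brokers"][broker_id]


# The client only knows one address up front, yet learns the whole cluster.
metadata = fetch_metadata("broker-102.example.com:9092")
print(sorted(metadata["brokers"]))
print(broker_for(metadata, "orders", 0))
```

The client bootstraps through broker 102 but ends up talking to broker 101 for the partition it wants, which is the point of the metadata exchange.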

[Figure: how a client connects to a Kafka cluster]

For fault tolerance, it is common practice to give a Kafka client at least two bootstrap servers in its connection string. If one of them is unavailable, the client can rely on another broker to respond with the cluster details. This also means that the people who configure Kafka clients, whether developers, business users, or DevOps engineers, do not need to know the hostname of every broker in the cluster. They only need to supply two or three broker addresses in the client's connection string.
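The fallback behaviour can be illustrated with a minimal sketch that parses a connection string and picks the first bootstrap server that responds. The function names, host names, and the injected `is_reachable` check are assumptions for illustration; real Kafka clients take the list through their `bootstrap.servers` setting and handle retries internally.

```python
def parse_bootstrap_servers(connection_string):
    """Split 'host1:9092,host2:9092' into a list of (host, port) tuples."""
    servers = []
    for entry in connection_string.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def first_reachable(servers, is_reachable):
    """Return the first server for which is_reachable(host, port) is true."""
    for host, port in servers:
        if is_reachable(host, port):
            return host, port
    raise ConnectionError("no bootstrap server is reachable")


servers = parse_bootstrap_servers(
    "kafka-1.example.com:9092, kafka-2.example.com:9092"
)

# Simulate an outage of the first broker instead of opening real sockets.
down = {("kafka-1.example.com", 9092)}
chosen = first_reachable(servers, lambda host, port: (host, port) not in down)
print(chosen)
```

With only one address in the string, the same outage would leave the client unable to bootstrap at all.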

Conclusion

  • A Kafka broker runs as a KafkaServer that hosts topics. The partitions of those topics are spread across the brokers in a Kafka cluster.
  • "--override property=value" is a command-line option that overrides the value set for a property in the server properties file.
  • The Kafka broker acts as a mediator between systems, transferring messages from the source to the destination.
  • Kafka brokers form a Kafka cluster by sharing information with each other, either directly or indirectly through ZooKeeper.
  • Each Kafka broker in a cluster is identified by its unique numeric ID.
  • Confirm that ZooKeeper is up and running before starting a Kafka broker.