Kafka Tools – Mirror Maker
Mirror Maker is a Kafka tool for copying data from one cluster to another. In this post we are going to see how to run Mirror Maker to do exactly that. This is especially useful when data in one cluster also needs to be available in a second one.
Mirror Maker and Source and Destination Topics
Remember that if a topic doesn’t already exist in the destination cluster, it will be created automatically, provided auto creation is enabled in your environment. Just check the auto.create.topics.enable configuration for your broker. It is enabled by default. In a production environment, however, you may want to create topics yourself so that you control their settings, including the number of partitions.
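If you would rather create the destination topic yourself than rely on auto creation, a small wrapper like the one below is one way to do it. Treat it as a sketch under stated assumptions: the function name create_destination_topic and the KAFKA_TOPICS override are made up for illustration, and it assumes the kafka-topics launcher from the Confluent packages together with the ZooKeeper-based --zookeeper option. Newer Kafka releases take --bootstrap-server with a broker address instead.

```shell
# Hypothetical helper: pre-create a topic on the destination cluster.
# KAFKA_TOPICS is an assumed override for the launcher name
# (Apache tarballs ship it as kafka-topics.sh).
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics}"

create_destination_topic() {
  # $1 = ZooKeeper connect string of the destination cluster
  # $2 = topic name, $3 = number of partitions, $4 = replication factor
  "$KAFKA_TOPICS" --create \
    --zookeeper "$1" \
    --topic "$2" \
    --partitions "$3" \
    --replication-factor "$4"
}

# Example call (placeholder host and topic):
# create_destination_topic dest-zk:2181 orders 6 3
```

Creating the topic up front means Mirror Maker writes into a topic whose partition count you chose, instead of one created with the broker defaults.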
How to Install
If you are using the Confluent packages, Mirror Maker should already be part of the install. You can run it using kafka-run-class.
Source Configurations
Mirror Maker consumes data from the source cluster, so its source settings are passed in as a consumer config. In the consumer config I used, bootstrap.servers specifies the Kafka brokers. Alternatively, zookeeper.connect can be used with its appropriate values if you run the older ZooKeeper-based consumer.
auto.offset.reset is an important setting if you want all existing data to be copied to the destination too. Please note that earliest is not the default value (the Java consumer defaults to latest), which may come as a surprise. There is already an issue filed about it:
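A minimal consumer config along these lines might look like the sketch below. The broker addresses, the group id, and the file name consumer.properties are placeholders chosen for illustration, not values from a real setup; bootstrap.servers, group.id, and auto.offset.reset are standard consumer settings.

```shell
# Write a minimal source-side (consumer) config for Mirror Maker.
# Hosts, group id, and file name are placeholders.
cat > consumer.properties <<'EOF'
# Brokers of the source cluster
bootstrap.servers=source-broker1:9092,source-broker2:9092
# Consumer group used by Mirror Maker
group.id=mirror-maker-group
# Start from the oldest available records when the group has no committed offsets
auto.offset.reset=earliest
EOF
```

Note that auto.offset.reset only applies when the group has no committed offsets yet, so reusing an existing group id will resume from where that group left off.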
https://issues.apache.org/jira/browse/KAFKA-4668
Destination Configuration
Mirror Maker produces records to the destination cluster, so its destination settings are specified as a producer config.
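A minimal producer config for the destination side might look like the following sketch. The broker addresses and the file name producer.properties are placeholders, and acks=all is included only as one common durability choice, not as a requirement of Mirror Maker.

```shell
# Write a minimal destination-side (producer) config for Mirror Maker.
# Hosts and file name are placeholders.
cat > producer.properties <<'EOF'
# Brokers of the destination cluster
bootstrap.servers=dest-broker1:9092,dest-broker2:9092
# Wait for all in-sync replicas to acknowledge each record
acks=all
EOF
```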
Running Mirror Maker
As we discussed above, it is very easy to run the Mirror Maker tool. Either a whitelist or a blacklist can be supplied, and each can hold a comma-separated list of the topics of interest.
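Putting the pieces together, an invocation might look like the sketch below. kafka.tools.MirrorMaker is the tool's main class, and --consumer.config, --producer.config, and --whitelist are its options; the function name run_mirror_maker, the KAFKA_RUN_CLASS override, the file names, and the topic names are assumptions made for illustration. Check the tool's usage output for the options your Kafka version supports.

```shell
# Hypothetical wrapper around the Mirror Maker launcher.
# KAFKA_RUN_CLASS is an assumed override for the launcher name
# (Apache tarballs ship it as kafka-run-class.sh).
KAFKA_RUN_CLASS="${KAFKA_RUN_CLASS:-kafka-run-class}"

run_mirror_maker() {
  # $1 = consumer (source) config file
  # $2 = producer (destination) config file
  # $3 = comma-separated whitelist of topics to copy
  "$KAFKA_RUN_CLASS" kafka.tools.MirrorMaker \
    --consumer.config "$1" \
    --producer.config "$2" \
    --whitelist "$3"
}

# Example call (placeholder files and topics):
# run_mirror_maker consumer.properties producer.properties "orders,payments"
```

Mirror Maker keeps running until it is stopped, so in practice you would start it in its own terminal or under a process supervisor.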