Apache Flink – Collections & Streams While developing a streaming application, it is often necessary to use some inputs as collections. This can be especially useful for testing purposes. In this post we discuss how to create a DataStream from a collection. After applying some operators on the stream, we are… Continue reading »
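A minimal sketch of the idea, assuming a standard Flink setup (the class name, sample values, and job name are illustrative):

```java
import java.util.Arrays;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CollectionStreamDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Build a DataStream from an in-memory collection -- handy for tests
        DataStream<String> stream = env.fromCollection(Arrays.asList("a", "b", "c"));

        // Apply an operator on the stream and print the results
        stream.map(String::toUpperCase).print();

        env.execute("collection-demo");
    }
}
```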
Custom Log4J Configuration file for Kafka In order to use a custom Log4J configuration file and to override the default one, we need to pass its path via -Dlog4j.configuration. Here is how you can update the configuration path: As you can notice, we have also updated the directory where logs are generated by overriding LOG_DIR.
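As a sketch, the override might look like this (the file paths and the start script are illustrative; KAFKA_LOG4J_OPTS and LOG_DIR are both honored by kafka-run-class.sh):

```shell
# Point Kafka at a custom Log4J configuration file (path is an example)
export KAFKA_LOG4J_OPTS="-Dlog4j.configuration=file:/etc/kafka/custom-log4j.properties"

# Override the directory where logs are generated
export LOG_DIR="/var/log/kafka"

bin/kafka-server-start.sh config/server.properties
```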
Kafka-Consumer-Groups – Kafka CLI Tools Kafka-Consumer-Groups is a CLI tool which can be used to inspect message consumption from Kafka. The tool can be used to get the list of topics and partitions consumed by a consumer group. It also shows whether a consumer is lagging in consumption of the messages. First we need… Continue reading »
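Typical invocations look like the following (broker address and group name are placeholders):

```shell
# List all consumer groups known to the cluster
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list

# Describe one group: topics, partitions, current offset, log-end offset and lag
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --describe --group my-consumer-group
```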
Deleting Connectors from Kafka Connect A connector can be deleted using Kafka Connect’s REST API. Here is the curl command:
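A sketch of such a call against Kafka Connect's REST API (host, port, and connector name are placeholders):

```shell
# DELETE /connectors/<name> removes the connector and its tasks
curl -X DELETE http://localhost:8083/connectors/my-connector
```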
Kafka Streams – Resetting Application State In the previous post, we discussed that we might need to reprocess data during application development. Since Kafka Streams maintains the application state, it doesn’t pull and reprocess data. In this case, you might find yourself waiting for the join operations to get triggered, but to… Continue reading »
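Kafka ships a reset tool for exactly this; a hedged sketch of running it (application id, broker address, and topic name are placeholders, and flag spellings vary slightly across Kafka versions):

```shell
# Reset committed offsets and intermediate topics for a Streams application
bin/kafka-streams-application-reset.sh \
  --application-id my-streams-app \
  --bootstrap-servers localhost:9092 \
  --input-topics input-topic
```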
Building Serde for Kafka Streams Application While developing Kafka Streams applications, I found myself requiring some utility code over and over. One such piece of code builds Serde for custom types. It is better if it is refactored into a separate type and used when needed. It has two methods, specificAvroSerde and genericAvroSerde. It must be… Continue reading »
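The post's specificAvroSerde and genericAvroSerde helpers depend on Confluent's Avro serdes; as a simpler illustration of the same refactoring idea, a generic builder can wrap any Serializer/Deserializer pair using Kafka's Serdes.serdeFrom (the class and method names here are illustrative):

```java
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.serialization.Serializer;

public final class SerdeBuilder {

    private SerdeBuilder() { }

    // Wrap a Serializer/Deserializer pair into a Serde for any custom type
    public static <T> Serde<T> buildSerde(Serializer<T> serializer,
                                          Deserializer<T> deserializer) {
        return Serdes.serdeFrom(serializer, deserializer);
    }
}
```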
Kafka Tools – kafka.tools.GetOffsetShell GetOffsetShell can be used to get the last offsets of a topic or of individual partitions of a topic. The tool is available with your Kafka installation. It can be run using kafka-run-class. Here we are getting the offsets for the Merged-SensorsHeartbeat topic.
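A sketch of such a run (the broker address is a placeholder; the topic name is the one from the post):

```shell
# --time -1 asks for the latest offsets; -2 would return the earliest
bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list localhost:9092 \
  --topic Merged-SensorsHeartbeat \
  --time -1
```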
Kafka Tools – Mirror Maker MirrorMaker is a Kafka tool for copying data from one cluster to another. In this post we are going to see how we can run Mirror Maker to copy data from one cluster to the other. This can be especially useful when we want to copy data between two… Continue reading »
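A hedged sketch of running the legacy MirrorMaker tool (the properties file names and topic pattern are placeholders):

```shell
# consumer.properties points at the source cluster, producer.properties at the target
bin/kafka-mirror-maker.sh \
  --consumer.config consumer.properties \
  --producer.config producer.properties \
  --whitelist "my-topic"
```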
Serde and Kafka Streams In order to push / pull messages from Kafka we need to use Serde. I have found that if you create a Serializer / Deserializer like the following then it becomes really useful for creating the Serde for your types. You can also use the method to create a GenericSerde to serialize… Continue reading »
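A minimal Serializer / Deserializer pair might look like this (a sketch for plain strings; the class names are illustrative, and since Kafka 2.0 the configure/close methods have defaults, so only serialize/deserialize need overriding):

```java
import java.nio.charset.StandardCharsets;

import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

public class PlainTextSerdeSupport {

    // Turns a String value into UTF-8 bytes for the wire
    public static class PlainTextSerializer implements Serializer<String> {
        @Override
        public byte[] serialize(String topic, String data) {
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Turns UTF-8 bytes from the wire back into a String value
    public static class PlainTextDeserializer implements Deserializer<String> {
        @Override
        public String deserialize(String topic, byte[] data) {
            return data == null ? null : new String(data, StandardCharsets.UTF_8);
        }
    }
}
```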
Kafka Streams Config’s General Properties In this post, I just want to note down the Kafka Streams configuration which I have to use over and over again for a Kafka Streams application. This can be used when developing a new Kafka Streams application. First we need to introduce the configuration for the stream. It includes details of the broker and… Continue reading »
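A sketch of such a reusable configuration, assuming String keys and values (the application id and broker address are placeholders):

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsConfigDemo {

    // General-purpose properties shared by new Kafka Streams applications
    public static Properties baseConfig() {
        Properties props = new Properties();
        // Identifies the application; also used to prefix internal topic names
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
        // Broker(s) to bootstrap from
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Default serdes for record keys and values
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG,
                  Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG,
                  Serdes.String().getClass().getName());
        return props;
    }
}
```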