Monthly Archives: October 2017

Confluent – Using off-the-shelf-connectors

Confluent – Using off-the-shelf-connectors with Landoop’s Kafka Connect UI

In this post we are going to see how easy it is to run a Kafka Connector in order to source data in Kafka using a well designed tool like Landoop’s Kafka Connect UI.

We can use Kafka Connect’s REST API to determine what connectors are available for our cluster. Here we are pulling this using /connector-plugins resource [http://localhost:8083/connector-plugins].

The same list is available in Kafka-connect-ui (discussed in an earlier post: Kafka-connect-ui). Here we are searching for file connectors. This is included as one of the default connectors available with the simple installation.

filestream-source-connector

Let’s provide the necessary configuration for FileStreamSourceConnector. This is used to pull data from a file into a kafka topic. Here we are pulling the contents of /tmp/students.csv file into students topic. As you can notice, we can easily pause, restart or delete the connector here.

connector-running

Here is how Landoop’s Kafka Connect UI Dashboard looks like:

Kafka Connect UI – Dashboard

We can also verify using Kafka Connect REST API. Here we are hitting the /connectors resource; which lists all the connectors currently running.

/connectors

In order to verify that the connector has actually loaded the file’s contents, we can use Kafka-topics-ui. We are simply pulling the docker image for the landoop’s tool and running it as follows:

Here we can select students topics and verify the data being pulled into this topic.

Connect Source data in Kafka Topics UI

Let’s update the contents of the file by adding some more contents. And we can see the items are correctly reflected in the topics.

topic-updated

Setting up Landoop’s Kafka-Connect-ui on Mac

In this post, we are going to setup the third of the series of tools by Landoop to manage Confluent services for Kafka. Landoop provides Kafka-connect-ui to manage connectors in Kafka-connect. The tools is available as a Docker image on DockerHub, so let’s first pull it on our machine.

docker pull landoop/kafka-connect-ui

Now we need to enable CORS. Just make sure that you update the properties file for Kafka Connect being used. I had to update the following file. Please note that I am running Confluent service locally on my MacBook using Confluent CLI.

Update connect properties

In order to update the CORS configs, just add the following lines to the property file. This should allow all requests from a different domain. In your production environment, you can be more restricted.

Now we can restart confluent services using Confluent CLI as follows:

Confluent stop
Confluent start

Finally we can run the docker image for Kafka-connect-ui. It must be remembered that you need to specify the actual IP address to avoid docker and host machine address mappings.

docker run –rm -p 8000:8000 -e “CONNECT_URL=http://192.168.1.101:8083” landoop/kafka-connect-ui

And here you can open it in the Browser. It shows up as follows for a single cluster setup:

kafka-connect-ui

Using Landoop’s Schema Registry UI

In this post, we are going to setup Schema Registry UI for Confluent’s schema registry using Docker Image for the tool. Schema Registry is an amazing tool by Landoop. It’s available in Github.

Let’s first pull the image from Docker Hub.

Schema Registry UI Docker Pull

Now we can simply run the docker image by forwarding the port forwarding. Plus we need to specify the URI for our schema registry.

schema-registry-ui-running

But as we run it and try to load the main page. We notice the error on the main page. Chrome’s Developer Tool should shed more light on the error. It’s actually CORS issue on our Schema Registry side.

Connectivity Error

So we need to enable CORS on our Schema Registry properties file. Let’s open /etc/schema-registry/ folder.

Just make sure that we restart the confluent services after this change. Actually it should be enough if we just restart the schema registry service.

confluent start

Now when we reload the page, the site loads just fine.

Installing Confluent Platform on Mac OS

There are no direct download instructions available on the confluent platform yet to install the packages on Mac OS. I just want to share it with others just in case someone might need some help with this.

Download the Archive

The zip packages can be downloaded from here: Download. After downloading unzip and copy it to a relevant folder.

Pre-requisite for Running

In order to run the platform, we need Java installed on our machine with version >= 7. Let’s first verify the version we have on my machine.

We also need to verify if the JAVA_HOME environment variable is correctly set on the machine. Additionally, the $PATH environment variable must be set, otherwise, the services wouldn’t just start.

Running Confluent CLI

Now we can simply run Confluent CLI from the bin folder of unzipped downloaded package. You should see the following help if you run it.

Confluent CLI

And finally all the services are running fine as we run it using confluent start.

Confluent Components & Their HTTP Ports

It is very important to remember the ports of individual components of Confluent Platform. The following is the list of default ports for the respective components. It must be remembered that these ports can be overridden during deployment.

Confluent components Ports

More details can be found on this page on confluent.io where this is copied from:

Kafka and Confluent logs in Log File

During development and debugging, it is very useful to see traces for Kafka and confluent in log file to determine the issues. The traces are specially useful when we use DSL for Kafka Streams. Here is the logback configuration to use logback appenders for kafka and confluent logs.

Log4J Dependency with Kafka

Log4J Dependency with Kafka

Many of the Kafka artifacts have Log4J dependencies. If you are using logback then your logging stops all of a sudden. Here is an example for one of our sample app. Here we have added such dependency and the App seems to have issues with multiple SL4J bindings. It has finally picked up Log4J instead.

If we look at this log, clearly the other binding is coming from Log4J. If we look at the Project tab, we can also see log4J artifact being added here.

log4j dependency

Here is our current build.sbt. It has “io.confluent” % “kafka-avro-serializer” % “1.0”, which is pulling up log4J as a dependency.

The solution is simple. We just need to tell SBT not to pull Log4J as dependency of this library.

And now we can see that the artifact for Log4J is removed from the external libraries.

log4j dependency exclusion

Packaging configurations outside Jars with SBT Native Plugin

In this post we are going to see how we can separately package any files (especially config and log configuration) out of packaged jar when we use sbt native packager for packaging our scala application. Here we are building from our last post.

In order to use sbt native packager; first we need to add the plugin in our packages configuration file (packages.sbt) under project folder.

Now we need to enable the plugin. Here JavaAppPackaging would package up jar files for our application. We can also notice a couple of mappings configurations. They copy application.dev.conf and logback.dev.xml files from resources folder to conf folder in the resulting package.

We have very simple configurations for our application.conf and application.dev.conf. We are simply keeping config.environment configuration. It has been assigned Debug and Dev values for debug and dev environments.

In order to demonstrate what configuration file is being used; here we are just pulling up config.environment configuration and logging it. Here we are using typesafe configs for handling with configuration files. If we simply run this code in IntelliJ debugger, you can notice that it pulls up application.conf file by default, hence writing Debug message to the logs.

debug-config

Let’s simply package it using sbt. Here we are running sbt universal:packageBin to package it.

Packaging

And this is how it packages it. You can notice that it has copied application.dev.conf and logback.dev.xml separately in conf folder.

separate-configs

Let’s simple run it using. We can notice that it has picked up application.conf file for configuration and has written Debug to the logs.

debug-config-used

We can override configuration file using -Dconfig.file argument. We can also override logback configuration file using -Dlogback.configurationFile argument.

But below we have just overridden the configuration file. You can notice that it has written Dev to the logs. This shows that it has correctly picked up application.dev.conf file as specified in the startup argument.

dev-config-used

avro-tools in Scala Projects – Don’t…

avro-tools in Scala Projects – Don’t…
In this post we are going to discuss how Avro-tools dependency can mess up your logging in a Scala project. Let’s create a sample Scala SBT project.

We update the build.sbt as follows:

Here are we adding dependencies for logback in addition to avro-tools dependency.

Let’s create a simple Main and add logging here.

Your editor (IntelliJ) might show you that there are multiple sources for LoggerFactory. Both jar files has the same package too. This should ring some alarms. You must remember that avro-tools is not adding org.sl4j as a dependency jar but it provides an implementation of LoggerFactory embedded into its own jar file. So you cannot use exclusion here.

multiple-implementations

Let’s also add configuration for logback to specify details of our required appenders.

But there is no log file created. Now just remove avro-tools dependency from your sbt file and now you can see the log file created. So, definitely avro-tools dependency has messed it up.

logFile

But you might have avro types added to your project. You wouldn’t be able to compile now as the required types are missing. But you don’t need the avro-tools here for this purpose. You can simply add dependency for “org.apache.avro” % “avro” % “1.8.2”, which is the latest version as of this blog post.

This should have no effect on your logging.