Kafka REST Proxy – Publishing Avro Messages to Kafka
Messages can also be published to Apache Kafka over REST, using the Kafka REST Proxy. It supports HTTP verbs including GET, POST and DELETE. Here is an example POST using curl, which we will dissect throughout this post:
Required Schema
Apache Kafka messages are key/value pairs, and the key and the value can each have their own schema. Kafka REST Proxy supports publishing messages along with their schema details. In the above case, the key is a straightforward int, specified as follows:
The value is also published in Avro format. Here we have a simple schema for a type with the fields studentId, studentName and height.
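As a minimal sketch, the two schemas could be written down in Python as shown here. The record name `Student` and the field types (`int`, `string`, `float`) are assumptions, since the text only names the fields. The REST Proxy expects each schema as a JSON-encoded string in the `key_schema` and `value_schema` fields of the request, which is why the dictionaries are serialized at the end.

```python
import json

# Key schema: a plain Avro int.
key_schema = {"type": "int"}

# Value schema: a record with the three fields named in the text.
# The record name and the field types are assumptions for illustration.
value_schema = {
    "type": "record",
    "name": "Student",
    "fields": [
        {"name": "studentId", "type": "int"},
        {"name": "studentName", "type": "string"},
        {"name": "height", "type": "float"},
    ],
}

# The proxy takes schemas as JSON strings, not as nested JSON objects.
key_schema_str = json.dumps(key_schema)
value_schema_str = json.dumps(value_schema)

print(key_schema_str)
print(value_schema_str)
```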
Headers
The request to publish data should include the recommended HTTP headers. In this case they are the Content-Type and Accept headers.
Content-Type
The Content-Type value follows the pattern application/vnd.kafka.[embedded_format].[api_version]+[serialization_format]. Here, the embedded format is the format of the data, while the serialization format is the format in which the request itself is serialized. In our case, the data is in Avro format while the request is serialized as JSON.
Currently the only supported serialization format is JSON, and the allowed API versions are v1 and v2. The embedded format can be json, avro or binary.
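To make the three parts of the media type concrete, here is a small sketch that assembles the Content-Type value from them. The helper name `kafka_content_type` is hypothetical and exists only for this illustration.

```python
def kafka_content_type(embedded_format, api_version, serialization_format="json"):
    """Build a media type such as application/vnd.kafka.avro.v1+json."""
    return (
        f"application/vnd.kafka.{embedded_format}."
        f"{api_version}+{serialization_format}"
    )

# Avro data, API v1, request serialized as JSON (the case used in this post).
content_type = kafka_content_type("avro", "v1")
print(content_type)
```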
Accept
Here you can specify your requirements for the response. Generally you specify the expected API version and serialization format. In our example, we simply specify the v1 version of the API and the JSON format.
Like other web API requests, it also supports content negotiation, including weighted preferences, as follows:
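Weighted preferences use the standard HTTP `q` parameter on each media type. The sketch below joins a list of preferences into one Accept value; the helper name `accept_header` and the particular weights are illustrative assumptions, not values taken from the original request.

```python
def accept_header(preferences):
    """Join (media_type, q) pairs into a single Accept header value."""
    return ", ".join(f"{media_type}; q={q}" for media_type, q in preferences)

# Prefer the versioned Kafka media type, fall back to plain JSON.
accept = accept_header([
    ("application/vnd.kafka.v1+json", 0.9),
    ("application/json", 0.5),
])
print(accept)
```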
Records
Kafka REST Proxy supports posting multiple records in a single request: the records field of the request data is an array.
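A rough sketch of how several records could be packed into that array follows. The sample student values are made up, and using `studentId` as the message key is an assumption for illustration.

```python
import json

def to_records(students):
    """Wrap each student dict as one entry of the 'records' array.

    Using studentId as the message key is an assumption.
    """
    return [{"key": s["studentId"], "value": s} for s in students]

# Made-up sample data with the fields named in the schema section.
students = [
    {"studentId": 1, "studentName": "Alice", "height": 5.6},
    {"studentId": 2, "studentName": "Bob", "height": 5.9},
]

records = to_records(students)
print(json.dumps({"records": records}, indent=2))
```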
Topic
Kafka REST Proxy listens on port 8082 by default. Messages can be published to any topic, which is specified in the path of the request. Just make sure the topic already exists if the cluster is not enabled for auto-creation of topics (which should usually be the case in your production environment). Here is how we have specified the endpoint for our topic on Kafka REST Proxy:
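For readers who prefer a script to the command line, the pieces discussed so far could be assembled roughly as below. The host `localhost` and the helper name `build_publish_request` are assumptions, and the value schema is a simplified stand-in rather than the student record. The request is only built, not sent, because sending requires a running REST Proxy.

```python
import json
import urllib.request

def build_publish_request(topic, payload, host="localhost", port=8082):
    """Build (but do not send) a POST to the proxy's /topics/<topic> path."""
    url = f"http://{host}:{port}/topics/{topic}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/vnd.kafka.avro.v1+json",
            "Accept": "application/vnd.kafka.v1+json",
        },
        method="POST",
    )

payload = {
    "key_schema": json.dumps({"type": "int"}),
    # Simplified stand-in for the student record schema.
    "value_schema": json.dumps({"type": "string"}),
    "records": [{"key": 1, "value": "hello"}],
}

request = build_publish_request("Student", payload)
print(request.get_method(), request.full_url)

# To actually send it against a running proxy:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```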
Response
When the command is executed, the response is returned. Since we are using curl, it is simply printed to the console. The response includes the offset and partition of each record, along with the key and value schema IDs.
Kafka REST Proxy is smart enough not to register a new schema if the same one is used to post subsequent messages.
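The fields of the response can be picked apart as in the following sketch. The response body used here is illustrative only: the schema IDs and offsets are made-up values, not output captured from a real run.

```python
import json

# Illustrative response body; the ids and offsets are made-up values.
response_text = """
{
  "offsets": [
    {"partition": 0, "offset": 0, "error_code": null, "error": null}
  ],
  "key_schema_id": 1,
  "value_schema_id": 2
}
"""

def summarize(text):
    """Return (key_schema_id, value_schema_id, [(partition, offset), ...])."""
    body = json.loads(text)
    positions = [(o["partition"], o["offset"]) for o in body["offsets"]]
    return body["key_schema_id"], body["value_schema_id"], positions

key_id, value_id, positions = summarize(response_text)
print(key_id, value_id, positions)
```

The returned IDs are the ones that can be reused in later requests in place of the full schemas.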
Published Message format in Schema Registry
Since our messages are in Avro format, their schemas can be saved in Schema Registry. The Kafka REST Proxy available with the Confluent packages automatically registers the schemas with Schema Registry. We can get the details of the stored schemas from the /subjects endpoint of Schema Registry, which runs on port 8081 by default:
Here you can see that it has created schemas for both the key and the value.
We can get the versions of the key and value schemas directly from the Schema Registry endpoints. Here are the curl commands and responses from Schema Registry:
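The Schema Registry endpoints involved can be sketched as follows. The host `localhost` is an assumption, and the subject names rely on the registry's default naming, in which a topic's schemas are stored under `<topic>-key` and `<topic>-value`. The `fetch_json` helper is defined but not called, since it needs a running registry.

```python
import json
import urllib.request

REGISTRY = "http://localhost:8081"  # assumed host, default port

def subject_names(topic):
    """Subjects used for a topic under the default naming strategy."""
    return [f"{topic}-key", f"{topic}-value"]

def versions_url(subject):
    """Endpoint listing the registered versions of a subject."""
    return f"{REGISTRY}/subjects/{subject}/versions"

def schema_url(subject, version):
    """Endpoint returning one specific version of a subject's schema."""
    return f"{REGISTRY}/subjects/{subject}/versions/{version}"

def fetch_json(url):
    """GET a registry endpoint and decode the JSON body (needs a live registry)."""
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))

for subject in subject_names("Student"):
    print(versions_url(subject))
    print(schema_url(subject, 1))
```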
Using Schema Ids to POST messages
We don't need to specify the schema details in every POST request. We can simply use the schema IDs of the key and value for subsequent HTTP POSTs. Here we are publishing to the same Student topic using key_schema_id and value_schema_id instead of the key_schema and value_schema fields of the request.
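A minimal sketch of such a request body follows. The IDs `1` and `2` are placeholders; in practice they would be the values returned in the response to the first POST. The helper name and the sample record are likewise assumptions for illustration.

```python
import json

def payload_with_schema_ids(key_schema_id, value_schema_id, records):
    """Request body that refers to already registered schemas by ID."""
    return {
        "key_schema_id": key_schema_id,
        "value_schema_id": value_schema_id,
        "records": records,
    }

# Placeholder IDs; use the ones returned by the proxy for your schemas.
payload = payload_with_schema_ids(
    1,
    2,
    [{"key": 3, "value": {"studentId": 3, "studentName": "Carol", "height": 5.4}}],
)
body = json.dumps(payload)
print(body)
```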