By   December 27, 2018

com.esotericsoftware.kryo.KryoException: Unusual solution upgrading Flink

While upgrading from Flink from 1.4 to a newer version 1.6.1, there are a few build issues. After we fix the issues, suddenly we started getting KryoException. Why are we getting this and how it this related to the upgrade? Let’s start with the exception message. If you try to start the job a few times and you have a few Source tasks, you would find it failing at different stages each time. That is based on the data received by Source tasks from topic. Here we are using Kafka source in Flink.

The biggest issue was the code ran fine in IntelliJ but as we run it in a cluster, it throws this error.

And restarting the job, it started throwing error another schema missing.

There are a number of solutions recommended by different developers and some of them have worked in certain cases.

Disabling Kryo fallback
Flink allows us to disable Kryo fallback. Here are the details:
https://ci.apache.org/projects/flink/flink-docs-master/dev/execution_configuration.html

But this didn’t work for us. We still kept getting the same errors as we run the job.

Solution

The solution was not to package the flink runtime libraries with the fat jar package built using sbt assembly. We can do that by adding dependencies as follows: