This article is an edited version of my talk at the Kafka Utrecht Meetup 2026. Thanks to Axual for hosting the meetup and the recording.
Kafka is fantastic. It’s fast, scalable, and reliable. But Kafka also has two traits that can make your life difficult: Kafka is lazy and Kafka doesn’t care.
Kafka treats messages as pure byte arrays. It doesn’t care what’s in those bytes. That’s good for performance, but bad for data quality: if a producer sends garbage, that garbage is delivered straight to your consumers. The result is crashing consumers and long debugging sessions.
And then there’s the issue of security. If you run Kafka in a public cloud or process sensitive data, transport encryption is often not enough. You want the data itself to be encrypted as well – and in such a way that not even the cloud provider can read it.
This is where Kroxylicious comes in. Kroxylicious is an open-source Kafka proxy that sits between your clients and the Kafka cluster. It understands the Kafka protocol and can analyze, validate, and modify messages.
In this article, we’ll look at two concrete use cases:
Record Validation: How to prevent invalid data from ever reaching your cluster.
Record Encryption: How to transparently encrypt data without your applications having to implement complex key management.
Before we start, let’s set up a small test environment. If you don’t have a Kafka environment yet, read our Kafka local test environment article.
First, let’s download Kroxylicious (we’re using version 0.18.0 here, but feel free to check the latest version on the project page):
wget https://github.com/kroxylicious/kroxylicious/releases/download/v0.18.0/kroxylicious-app-0.18.0-bin.tar.gz
tar -xzf kroxylicious-app-0.18.0-bin.tar.gz
mv kroxylicious-app-0.18.0 kroxylicious
cd kroxylicious
Next, we create a configuration file. We tell Kroxylicious to listen on port 9292 and forward all requests to our local Kafka broker on 9092.
config.yaml
virtualClusters:
  - name: demo
    targetCluster:
      bootstrapServers: localhost:9092
    gateways:
      - name: mygateway
        portIdentifiesNode:
          bootstrapAddress: localhost:9292
          nodeIdRanges:
            - name: brokers
              start: 0
              end: 3
filterDefinitions: []
Make sure your nodeIdRanges match your Kafka configuration. If your broker IDs start at 1, adjust start accordingly.
Now let’s start the proxy:
./bin/kroxylicious-start.sh --config config.yaml
If everything worked, the proxy is now running. Your Kafka clients can choose: connect directly to Kafka (localhost:9092) or go through Kroxylicious (localhost:9292).
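Before pointing clients at the proxy, it can be useful to sanity-check that both endpoints are reachable. A small helper sketch (the port numbers are the ones from our config; with Kafka and Kroxylicious running locally, both checks should report True):

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print("Kafka (9092):", port_open("localhost", 9092))
print("Proxy (9292):", port_open("localhost", 9292))
```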
Imagine you have a producer sending JSON data. A developer in another team (definitely not you 😉) makes a mistake and the producer suddenly sends invalid JSON.
What happens without validation?
Kafka accepts the message (it’s just a byte array after all).
Your consumer reads the message.
The JSON parser in the consumer throws an exception.
The consumer crashes and gets stuck in a retry loop.
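None of this is specific to Kafka – step 3 is simply what any JSON deserializer does when it is fed garbage. A minimal Python sketch of the consumer’s failure mode (the payloads are arbitrary examples):

```python
import json

def handle_record(value: bytes) -> dict:
    # A typical consumer deserializes the raw bytes it reads from Kafka.
    # If a producer sent garbage, this is where the exception is raised.
    return json.loads(value)

# A clean record works fine:
print(handle_record(b'{"turbine": 7, "rpm": 14.2}'))

# A garbage record blows up the consumer:
try:
    handle_record(b"I am not a valid JSON")
except json.JSONDecodeError as e:
    print(f"consumer would crash here: {e}")
```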
Schema Registries are designed to solve this problem. But the checks happen only on the client side. A misconfigured producer can still produce garbage.
With Kroxylicious, we can prevent this. We configure a Record Validation Filter that ensures only syntactically correct JSON lands in the wind-turbine-data topic.
To do this, we adapt the config.yaml and add the filter definition:
config.yaml
filterDefinitions:
  - name: record-validation
    type: RecordValidation
    config:
      rules:
        - topicNames:
            - wind-turbine-data
          valueRule:
            syntacticallyCorrectJson:
              validateObjectKeysUnique: true
              allowNulls: true
              allowEmpty: false
defaultFilters:
  - record-validation
To keep the example simple, we are only checking if the JSON is syntactically correct. Of course, Kroxylicious also supports Schema Validation. But that would require setting up a Schema Registry, which would be too complex for this example.
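To make the three rule options concrete, here is a rough Python approximation of the checks – an illustration of the semantics only, not how Kroxylicious actually implements them:

```python
import json

def validate_value(value: bytes, allow_nulls=True, allow_empty=False,
                   unique_keys=True) -> None:
    """Raise ValueError if `value` violates the configured rules."""
    if not value.strip():
        if not allow_empty:
            raise ValueError("empty values are not allowed")  # allowEmpty: false
        return

    def reject_duplicates(pairs):
        keys = [k for k, _ in pairs]
        if len(keys) != len(set(keys)):
            raise ValueError("duplicate object keys")  # validateObjectKeysUnique: true
        return dict(pairs)

    parsed = json.loads(value,
                        object_pairs_hook=reject_duplicates if unique_keys else None)
    if parsed is None and not allow_nulls:
        raise ValueError("null values are not allowed")

validate_value(b'{"turbine": 7}')      # passes
validate_value(b'null')                # passes: allowNulls is true
# validate_value(b'')                  # would raise: allowEmpty is false
# validate_value(b'{"a": 1, "a": 2}')  # would raise: duplicate keys
```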
After restarting Kroxylicious, here’s what happens when a producer tries to send faulty JSON:
echo "I am not a valid JSON" | kafka-console-producer.sh \
--bootstrap-server localhost:9292 --topic wind-turbine-data
Result:
org.apache.kafka.common.InvalidRecordException:
Value was invalid: value was not syntactically correct JSON:
Unrecognized token 'I': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
Interestingly, this error message is not part of the Kafka protocol itself – but the protocol does support returning free-text error messages, and Kroxylicious uses exactly this capability to pass the error back to the producer.
The producer immediately receives an error (InvalidRecordException), the message never lands in the Kafka log, and your consumer keeps running peacefully because it only ever receives clean data.
Encryption is often a nightmare. Client-side encryption means you have to deal with key management, rotation, and distributing keys to all teams. If you do it wrong, your data is either insecure or lost forever.
Kroxylicious can act as an Encryption Gateway. It uses Envelope Encryption and integrates with Key Management Systems (KMS) like HashiCorp Vault, AWS KMS, or Azure Key Vault.
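The idea behind Envelope Encryption: each record (or batch) is encrypted with a fresh Data Encryption Key (DEK), and only the DEK is wrapped by the Key Encryption Key (KEK) held in the KMS – the bulk data never travels to the KMS. The following toy Python sketch shows the pattern only; the XOR “cipher” is deliberately insecure so the example needs nothing beyond the standard library, whereas real implementations use AES-GCM and KMS wrap/unwrap calls:

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR 'cipher' for illustration only -- NOT real cryptography."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

# The KEK lives in the KMS (e.g. Vault) and never leaves it.
kek = secrets.token_bytes(32)

def encrypt_record(plaintext: bytes) -> tuple[bytes, bytes]:
    dek = secrets.token_bytes(32)           # fresh DEK
    ciphertext = keystream_xor(dek, plaintext)
    wrapped_dek = keystream_xor(kek, dek)   # in reality: a KMS "wrap" call
    return wrapped_dek, ciphertext          # both end up stored in Kafka

def decrypt_record(wrapped_dek: bytes, ciphertext: bytes) -> bytes:
    dek = keystream_xor(kek, wrapped_dek)   # in reality: a KMS "unwrap" call
    return keystream_xor(dek, ciphertext)

wrapped, ct = encrypt_record(b'{"turbine": 7, "rpm": 14.2}')
assert decrypt_record(wrapped, ct) == b'{"turbine": 7, "rpm": 14.2}'
```

The point of the pattern: the broker stores only ciphertext plus a wrapped DEK, and nobody without access to the KEK in the KMS can recover the plaintext.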
For this example, we’ll use HashiCorp Vault in Dev Mode (please don’t do this in production!).
wget https://releases.hashicorp.com/vault/1.18.0/vault_1.18.0_linux_amd64.zip
unzip vault_1.18.0_linux_amd64.zip
./vault server -dev
We need to configure Vault so that Kroxylicious is allowed to create and use keys. (You can find the exact policies and commands in the Kroxylicious documentation, but essentially we allow operations on the path transit/keys/KEK_*).
export VAULT_ADDR='http://127.0.0.1:8200'
./vault secrets enable transit
Then we create a policy for Kroxylicious:
kroxylicious_encryption_filter_policy.hcl
path "transit/keys/KEK_*" {
  capabilities = ["read"]
}
path "transit/datakey/plaintext/KEK_*" {
  capabilities = ["update"]
}
path "transit/decrypt/KEK_*" {
  capabilities = ["update"]
}
And apply the policy:
./vault policy write kroxylicious_encryption_filter_policy kroxylicious_encryption_filter_policy.hcl
Finally, we create a token for Kroxylicious:
./vault token create -display-name "kroxylicious record encryption" \
-policy=kroxylicious_encryption_filter_policy \
-period=768h \
-no-default-policy \
-orphan \
-format=json | jq -r .auth.client_token > kroxylicious-encryption-token
For every topic we want to encrypt, we need to create a Key Encryption Key (KEK):
./vault write -f transit/keys/KEK_my_encrypted_topic type=aes256-gcm96 auto_rotate_period=90d
Back in config.yaml, we add the RecordEncryption filter:
config.yaml
filterDefinitions:
  - type: RecordEncryption
    name: record-encryption
    config:
      kms: VaultKmsService
      kmsConfig:
        vaultTransitEngineUrl: http://localhost:8200/v1/transit
        vaultToken:
          passwordFile: /path/to/kroxylicious-encryption-token
      selector: TemplateKekSelector
      selectorConfig:
        template: "KEK_${topicName}"  # For topic `my_encrypted_topic`, the KEK is named `KEK_my_encrypted_topic`.
defaultFilters:
  - record-encryption
What happens now?
Your Producer sends plaintext to Kroxylicious (localhost:9292).
Kroxylicious encrypts the message content before forwarding it to Kafka (localhost:9092).
Your Consumer, reading through Kroxylicious, receives the data automatically decrypted.
The trick: If you (or an attacker) listen directly on the Kafka broker (localhost:9092), you only see gibberish:
kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic my_encrypted_topic --from-beginning
CAC_my_encrypted_topic......
This means your data is securely encrypted At Rest in the Kafka cluster. The Kafka broker itself never sees the plaintext.
Is this End-to-End Encryption or not?
That depends. If Kroxylicious is under the control of the Producer (e.g., as a sidecar), I would call it End-to-End Encryption. If Kroxylicious is operated by the Kafka team, it’s merely At-Rest Encryption – the Kafka team could decrypt the data if they wanted to, but at least if Kafka were run in the cloud, the cloud provider could not read the data.
Kroxylicious allows us to use and implement server-side functions that Apache Kafka itself does not support. The price for this is an additional hop in the network and a potential single point of failure.
Especially with increased security requirements, it makes sense not to implement encryption yourself, but to offload this functionality to tested software.
Kroxylicious can also be a useful solution for communicating with external partners. This way, we can ensure that data always arrives in the correct format.
Tip: Before using Kroxylicious, please check whether you really need it. It’s not a solution for every problem.