Impressive data volumes:
over 90,000 messages per second
are processed by REWE in real-time.
Most people know REWE as a classic supermarket. What do you do in the field of data processing?
Paul Puschmann: We operate in two worlds: On the one hand, we have the traditional brick-and-mortar retail with our stores and logistics centers. On the other hand, we are strongly positioned digitally with our online shop, delivery service, and our apps. Especially in the digital area, we process large amounts of data in real-time – from product information to orders to market data.
Why do you need Kafka for that?
Patrick Wegner: Kafka is like a digital nervous system for us. It connects our various services and enables data to arrive in real-time where it’s needed. When someone uses self-checkout in a store or places an order online, many systems need to work together in a coordinated way. Kafka makes this possible.
What data volumes are we talking about?
Patrick Wegner: On just one of our clusters, we process up to 90,000 messages per second during peak times, and that is only one of several clusters. What’s particularly impressive: despite these enormous data volumes, the system runs stably and uses resources very efficiently.
About Paul Puschmann: Paul Puschmann has been working at REWE digital since 2014. As an IT Operations Engineer, he accompanied the development of the food delivery service and supported the integration of Apache Kafka. Today he works in the Cloud Center of Excellence and drives the company’s digital transformation forward.
About Patrick Wegner: Patrick Wegner works in the Integration Platform Team at REWE digital with a focus on Apache Kafka. He is responsible for operating and developing the Kafka infrastructure, which is a central component for REWE’s digital services such as online shop, apps, and self-checkout systems.
What does this mean concretely for customers?
Paul Puschmann: A good example is our mobile self-checkout: customers can scan and pay for products themselves without queuing at the checkout. Behind this is a complex system of real-time data flows coordinated via Kafka. When you order online, too, Kafka ensures that all the systems involved – from inventory management to delivery – stay coordinated.
Why did you decide on Kafka?
Paul Puschmann: We started with Kafka in 2015 when we modernized our online shop. The big advantage of Kafka is its flexibility: We can develop new features and let old systems continue running in parallel. This enables us to modernize step by step without major system outages.
Sounds like a big technical challenge…
Patrick Wegner: Interestingly, Kafka itself is surprisingly "boring" – and I mean that positively. It simply runs reliably. The real challenge lies in everything around it: How do we make the system easy for developers to use? How do we design access rights? How do we ensure that Kafka is deployed optimally?
What role does Kafka play in your future IT strategy?
Paul Puschmann: Kafka is the bridge between the classical IT world and modern microservice architectures for us. We build new applications directly with Kafka. We migrate existing systems step by step when there’s a clear added value. This is a strategic architectural decision.
What has been the biggest challenge in operations so far?
Patrick Wegner: In all our years of operation, we have had only two notable technical problems. The first was a faulty configuration entry in ZooKeeper, caused by insufficient validation in an external tool. The second was a performance issue caused by Java garbage collection. We solved both through targeted configuration adjustments. This shows how fundamentally stable the system is.
Proven stability:
Only 2 technical problems
in 10 years of Kafka operation.
What developments do you wish for in the future?
Patrick Wegner: Better native support for schema validation at the broker level would be very helpful. We could then enforce constraints in the broker itself instead of in upstream components, which would give us stronger guarantees for data quality in our topics.
Paul Puschmann: A few years ago I would have said that I wish for better metrics, but that wish has now been fulfilled by very good metric exporters. What’s most important to me now is that Kafka keeps its existing strengths. We’ve been using the system since version 0.8, and the fact that we don’t have to rebuild our clients with every update is a huge advantage. This stability and backward compatibility should definitely be preserved.
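Patrick Wegner's point about enforcing constraints "in upstream components" can be illustrated with a minimal sketch. It is not REWE's implementation and uses no Kafka client library: the schema, the field names, and the `publish_validated` helper are all hypothetical, and a plain Python list stands in for the producer call. It only shows the general idea of rejecting a record before it ever reaches a broker.

```python
from typing import Any, Callable

# Hypothetical schema for an order event: field name -> required Python type.
ORDER_SCHEMA: dict[str, type] = {
    "order_id": str,
    "store_id": int,
    "total_cents": int,
}


def validate(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of schema violations; the list is empty if the record is valid."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return errors


def publish_validated(
    record: dict[str, Any],
    schema: dict[str, type],
    send: Callable[[dict[str, Any]], None],
) -> bool:
    """Call `send` only for records that satisfy the schema."""
    if validate(record, schema):
        return False  # rejected upstream; the broker never sees this record
    send(record)
    return True


# Stand-in for a real producer call: accepted records are collected in a list.
accepted: list[dict[str, Any]] = []
ok = publish_validated(
    {"order_id": "A-1", "store_id": 42, "total_cents": 1999},
    ORDER_SCHEMA,
    accepted.append,
)
bad = publish_validated(
    {"order_id": "A-2", "store_id": "42"},
    ORDER_SCHEMA,
    accepted.append,
)
```

The limitation of this pattern is the one raised in the interview: every producer has to run such a check itself, whereas validation at the broker level would apply to all writers of a topic.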
The REWE Group is one of the leading retail and tourism groups in Germany and Europe. Founded in 1927, the REWE Group achieved total external sales of over 90 billion euros in 2023. With its over 380,000 employees, REWE combines traditional retail with the most modern digital services. The online shop, mobile self-checkout, and delivery service are examples of the company’s digital transformation.
And last but not least: You’ve known our Kafka training courses for several years. What convinces you about this training approach?
Patrick Wegner: The preferred style of training varies a lot from person to person: some people like lecture-style teaching, others prefer interactive learning. What I particularly like about working with Anatoly is exactly this interactive approach. This way of learning and transferring knowledge works very well for us.
Thank you for the interesting conversation.