Kafka in Automotive: The Solution for Exponentially Growing Data Traffic?

Modern vehicles produce enormous amounts of data. This challenges not only mobile networks but also the IT systems of manufacturers. How can Apache Kafka help?

Recently, a meme went viral on social media. It showed two numbers and a picture of Tesla and X boss Elon Musk. On one side was the number of fully automated robo-taxis that Musk had promised the world in 2020. And on the other, the number of vehicles he had actually managed to deliver by summer 2023.

Small spoiler: The figures didn’t quite match up.

Cars will become increasingly autonomous, and that requires enormous amounts of real-time data to be processed. But even setting that aside, the automotive sector already faces significant challenges today. It’s no coincidence that one particular phrase has been heard frequently in recent years:

»Car manufacturers should no longer define themselves as car builders in the future, but as software companies.«

(Industry wisdom on the automotive transformation)

Yes, exactly. That’s right.

Data Challenges in the Automotive Sector

But let’s start with the present. There’s plenty to do there. Modern vehicles already produce more and more data, driven by rising expectations from vehicle owners.

Owners' expectations include:

  • Controlling the car remotely, that is, having access even from a distance.

  • Using a smartphone instead of a key, or at least as a substitute key.

  • Lending the car out easily, ideally without handing over a key.

  • And yes: eventually, the car should drive itself.

The growing data load created by these demands challenges mobile networks, whose operators can barely keep up with the adjustments. The situation is even more difficult for manufacturers' IT systems. Manufacturers face several essential questions, and the answers look complex:

  • How can millions of vehicles send their data in real-time to users' apps?

  • How do real-time commands reach the vehicles?

  • How can manufacturers collect and store sensor data that they need for software training? Hashtag: autonomous driving.

The Significance of Apache Kafka

The solution is: Apache Kafka.

Although that’s not quite right in this case. Vehicles rarely communicate directly with Kafka. Instead, they use the specialized protocol MQTT. The vehicles send measurement data via MQTT to MQTT brokers, and an MQTT adapter then forwards the data to Kafka. Come again?

Figure 1. Kafka MQTT architecture: Kafka as an interface between MQTT and other applications

To explain:

  • MQTT: An open network protocol for sending messages from machine to machine.

  • MQTT Broker: A server with which clients communicate. The broker receives messages from clients and forwards them to the other clients that have subscribed to them.

  • MQTT Adapter: Links MQTT brokers and Kafka brokers. It enables bidirectional data transfer.
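To make the adapter's job more concrete, here is a minimal sketch in Python of the translation step only, without any network code. Everything named here is an assumption for illustration: the MQTT topic layout `vehicles/<vehicle_id>/<kind>`, the Kafka topic names in `TOPIC_MAP`, and the helper functions themselves. A real adapter would wrap this logic in an MQTT client and a Kafka producer.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical mapping from the last MQTT topic level to a Kafka topic.
TOPIC_MAP = {
    "telemetry": "vehicle-telemetry",
    "status": "vehicle-status",
}


@dataclass(frozen=True)
class KafkaRecord:
    topic: str
    key: bytes
    value: bytes


def mqtt_to_kafka(mqtt_topic: str, payload: bytes) -> Optional[KafkaRecord]:
    """Translate one MQTT message into a Kafka record.

    Returns None when the topic does not match the assumed layout.
    """
    parts = mqtt_topic.split("/")
    # Assumed layout: vehicles/<vehicle_id>/<kind>
    if len(parts) != 3 or parts[0] != "vehicles" or not parts[1]:
        return None
    _, vehicle_id, kind = parts
    kafka_topic = TOPIC_MAP.get(kind)
    if kafka_topic is None:
        return None
    # Using the vehicle ID as the key keeps one vehicle's messages
    # in the same partition, and therefore in order.
    return KafkaRecord(kafka_topic, vehicle_id.encode("utf-8"), payload)


def kafka_to_mqtt(record: KafkaRecord) -> Tuple[str, bytes]:
    """Opposite direction: turn a command record into an MQTT publish."""
    vehicle_id = record.key.decode("utf-8")
    return f"vehicles/{vehicle_id}/commands", record.value
```

Choosing the vehicle ID as the record key is a design decision in this sketch, not something the architecture prescribes. It trades even load distribution for per-vehicle ordering.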

But what does this achieve on a technical level?

Thanks to Apache Kafka, the resulting data streams can be processed and stored efficiently. While MQTT enables communication between vehicles and brokers, Kafka handles the data volumes arising from this exchange. It also preserves the information for the future by storing it durably. New data pipelines can be integrated, and entire systems become more scalable. In summary: Kafka handles the processing, storage, and analysis of the data arriving via MQTT.

The advantage for drivers that emerges from this symbiosis is easy to understand. People can activate the auxiliary heating in their car from their own living room. And, of course, the other way round: they can turn up the air conditioning at home from the car.

Security Requirements in Data Handling

That sounds logical. But besides the necessary IT infrastructure, automotive manufacturers have another area under construction: data security. Companies must implement complex security measures and answer questions such as:

  1. How do manufacturers ensure that customers see only their own data?

  2. How do they prevent cyber criminals from sending false data or, worse still, breaking into systems?

  3. For machine learning, the data must be anonymized. But how can this be achieved simply and quickly?

This is where it gets interesting. Kafka doesn’t come with many of these functions out of the box; development is required.
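As a rough idea of what such development can look like, the following Python sketch covers two of the questions above: filtering events by ownership, and removing direct identifiers before data goes into training. It is a sketch under stated assumptions, not a complete security solution. The field names (`vehicle_id`, `owner_name`), the `ownership` lookup, and all function names are hypothetical. Note also that the keyed hash shown here is pseudonymization, which is weaker than true anonymization: whoever holds the secret can link pseudonyms back to vehicles.

```python
import hashlib
import hmac
from typing import Dict, List

# Hypothetical fields that identify a person or vehicle directly.
DIRECT_IDENTIFIERS = {"vehicle_id", "owner_name"}


def pseudonymize(vehicle_id: str, secret: bytes) -> str:
    """Replace a vehicle ID with a stable keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret, vehicle_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def strip_for_training(event: dict, secret: bytes) -> dict:
    """Drop direct identifiers and attach a pseudonym instead."""
    cleaned = {k: v for k, v in event.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["vehicle_pseudonym"] = pseudonymize(event["vehicle_id"], secret)
    return cleaned


def visible_to(customer_id: str, events: List[dict],
               ownership: Dict[str, str]) -> List[dict]:
    """Return only the events of vehicles owned by this customer."""
    return [e for e in events if ownership.get(e["vehicle_id"]) == customer_id]
```

The same vehicle always maps to the same pseudonym, so training data from one car can still be grouped without exposing its real ID.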

How Automotive Manufacturers Use Kafka

Quite specifically, automotive manufacturers use Kafka as follows:

  • Kafka is very good at buffering enormous amounts of data and delivering them to target systems: fast and cost-effective.

  • Many teams from one company need to access this data simultaneously. Kafka provides the infrastructure for this.

  • Kafka shortens the time teams need to get from idea to implementation. It’s significantly faster than with comparable messaging products.
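Why buffering and simultaneous access fit together can be shown with a toy model. The Python sketch below is an in-memory illustration and deliberately not the Kafka client API: the class `ToyLog` and the group names are made up. It only mirrors the underlying idea that records stay in the log after being read, and that each consumer group keeps its own read position.

```python
from collections import defaultdict
from typing import Dict, List


class ToyLog:
    """In-memory stand-in for one partition of a topic."""

    def __init__(self) -> None:
        self._records: List[str] = []
        # One read position (offset) per consumer group.
        self._offsets: Dict[str, int] = defaultdict(int)

    def append(self, record: str) -> int:
        """Store a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> List[str]:
        """Return the next records for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = ToyLog()
for reading in ["speed=42", "speed=47", "speed=51"]:
    log.append(reading)

# Two teams read the same data independently of each other.
app_batch = log.poll("app-team")
ml_batch = log.poll("ml-team", max_records=2)
```

Reading does not remove anything from the log here, so a slow team never blocks a fast one, and a team that joins later can still start from the beginning.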

Conclusion

Only when people’s data in their cars is protected as well as their own homes will the connected car have a thoroughly positive future.

About Anatoly Zelenin
Hi, I’m Anatoly! I love to spark that twinkle in people’s eyes. As an Apache Kafka expert and book author, I’ve been bringing IT to life for over a decade—with passion instead of boredom, with real experiences instead of endless slides.
