More Articles from This Series
"Data is the oil of the 21st century."
That’s now a worn-out phrase. And it also leads to a misconception.
We think that having data is enough. Yet data alone is of little value if its quality isn’t right and employees view sharing information within the company as an annoying side requirement.
We must understand: Successful data management doesn’t rest solely on systems and structures. It’s also a cultural question. It requires understanding what value information can deliver. It helps if we think of data as a product.
To implement the Data Mesh successfully, we need to reframe the term "data". Data should not be seen as waste or as a burdensome responsibility that has to be managed. It is an asset, a tool that makes our own work easier.
One reason organizations fail at this cultural question is a familiar phenomenon. In his paper "Data on the Outside versus Data on the Inside", Pat Helland describes it as "My Data" vs. "Your Data."
Or put differently: Service-Internal Data vs. Service-External Data.
What’s the difference?
Service-Internal Data is stored so that the service can get the most out of the data. There’s strong coupling between the service and the data, which is good in this case. The data is specifically tailored to this one application.
Service-External Data is intended for data exchange. The data is rather loosely coupled. We achieve this by not tailoring it to a specific application, but keeping it as general as possible.
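To make the distinction concrete, here is a minimal Python sketch. All names in it (OrderRow, OrderPlaced, to_external, and every field) are hypothetical and invented for illustration; they don’t come from Helland’s paper or from any particular system. The sketch assumes an order service that stores amounts as non-negative cents and timestamps as Unix seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class OrderRow:
    """Service-internal shape (hypothetical): tailored to one application."""
    id: int
    cust_fk: int        # foreign key into the service's own customer table
    status_code: int    # internal enum, meaningless outside the service
    price_cents: int    # assumed to be non-negative
    created_ts: float   # Unix timestamp as stored internally


@dataclass(frozen=True)
class OrderPlaced:
    """Service-external shape (hypothetical): general, readable fields."""
    order_id: str
    customer_id: str
    total_amount: str   # decimal string, e.g. "19.99"
    currency: str
    placed_at: str      # ISO 8601 timestamp in UTC


def to_external(row: OrderRow, currency: str = "EUR") -> OrderPlaced:
    """Translate the internal row into the shape meant for data exchange.

    Internal details such as status_code are deliberately left out.
    """
    amount = f"{row.price_cents // 100}.{row.price_cents % 100:02d}"
    placed_at = datetime.fromtimestamp(row.created_ts, tz=timezone.utc)
    return OrderPlaced(
        order_id=f"order-{row.id}",
        customer_id=f"customer-{row.cust_fk}",
        total_amount=amount,
        currency=currency,
        placed_at=placed_at.isoformat(),
    )
```

The point of the explicit translation step is that the internal shape can change freely, as long as to_external keeps producing the same external shape for other teams.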
Now the problem: When developing services, those involved often think only about internal data. At some point, they realize that others also need access to the data. What happens then? The requesters are given access to the internal data without anyone considering whether that’s even appropriate.
It would be better, however, if teams could share information sensibly, securely, and without special effort, because team-external data is a core asset within a company.
If we grasp that this data in particular is also a product that can be developed and continuously improved, we make life easier for everyone in our own company.
As a foundation, there are six principles:
Data as a product must be …
discoverable: Other teams must be able to find the data. And for that, they must first know that the data exists at all.
addressable: It’s clear where the data is located. Mix-ups are ruled out.
trustworthy: The data has high quality.
self-describing: Ideally, data is structured so that teams can understand it independently without the help of experts.
interoperable: The data should be provided in a format that is compatible with other data sources.
secure: It must be safe to use the data. For this, internal security standards must be maintained.
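The six principles can also be written down as a simple self-check. The following Python sketch is one possible way to do that, not a standard: the DataProduct class, its fields, and the unmet_principles function are all hypothetical names chosen for this example. It only checks whether the corresponding metadata has been filled in; it says nothing about whether the data really is of high quality or secure.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor; one field per principle."""
    name: str
    catalog_entry: str = ""   # discoverable: where other teams can find it
    address: str = ""         # addressable: unambiguous location
    quality_checks: list[str] = field(default_factory=list)  # trustworthy
    schema_doc: str = ""      # self-describing: documented structure
    data_format: str = ""     # interoperable: agreed exchange format
    access_policy: str = ""   # secure: who may use the data


def unmet_principles(product: DataProduct) -> list[str]:
    """Return the principles for which no metadata has been provided yet."""
    checks = {
        "discoverable": bool(product.catalog_entry),
        "addressable": bool(product.address),
        "trustworthy": bool(product.quality_checks),
        "self-describing": bool(product.schema_doc),
        "interoperable": bool(product.data_format),
        "secure": bool(product.access_policy),
    }
    return [principle for principle, ok in checks.items() if not ok]
```

A team could run such a check before publishing a data product and treat every entry in the returned list as an open task.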
This checklist alone does not complete the path to "Data as a Product", but it helps to understand the fundamentals.