
Data Mesh and the Watchmaker

By using the analogy of a watchmaker to better understand data mesh, we see data products in the context of gears, with each gear serving a unique purpose.

Dwayne Johnson
December 21, 2021 · 3 min read
Watchmakers as an analogy for data mesh

The watchmaker analogy goes back more than 200 years. It asks: if you were walking in the woods and found a watch lying on the ground, what would you think? Did it grow organically, or was there a designer? You might examine it externally and be fascinated by the seconds, minutes, hours, days, weeks, months, moon phases and tides being displayed. You might then test its accuracy and marvel at its continuous precision. Then you might open it up and be even more amazed that the displays are all orchestrated by a set of gears, each integrated with the others and performing its purpose to track and display a portion of the current time on the face. Almost everyone would conclude there had to be an overall designer.

Let’s continue the watchmaker analogy with data mesh, specifically considering data products in the context of gears. Each gear has a unique purpose. For example, there is no need for two gears to keep track of seconds. Just as each gear is responsible for tracking its element of time, each data product is responsible for tracking its own set of data elements. Each watch gear has a specific shape and size. Within the data mesh, each data product has a specific shape and size, called its bounded context. This enables each data product to have specific purposes and to reduce (ideally eliminate) the need for two data products to manage the same data.

Each gear has cogs to enable it to interface with other gears and watch components, e.g., the mainspring. Similarly, data products have APIs (and abstracted views for co-located and connected data products) to interface with other data products and users.
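As a rough sketch of this idea, the snippet below models a data product that owns its bounded context internally and exposes data only through a narrow interface. All names here (`OrdersDataProduct`, `publish`, `get_order`) are hypothetical, for illustration only; they do not correspond to any real data mesh library.

```python
from dataclasses import dataclass, field


@dataclass
class OrdersDataProduct:
    """A data product owns its bounded context; consumers use its API."""
    _orders: dict = field(default_factory=dict)  # internal storage, never exposed directly

    def publish(self, order_id: str, record: dict) -> None:
        # Only the owning domain team writes through this "input port".
        self._orders[order_id] = dict(record)

    def get_order(self, order_id: str) -> dict:
        # Consumers read a defensive copy through the "output port";
        # internal structures stay hidden, like gears behind the watch face.
        return dict(self._orders[order_id])


dp = OrdersDataProduct()
dp.publish("o-1", {"amount": 42.0, "region": "EMEA"})
print(dp.get_order("o-1"))
```

The design choice mirrors the cogs-and-gears point: other data products never reach into `_orders` directly, so the owning team can reshape its internals without breaking consumers.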

A watch gear only needs to be concerned with the portion of time it must track and display. But what happens when we move to a different time zone? We need the ability to set the watch to the new time zone. This is easily done by spinning the crown forward or backward to the appropriate hour. All gears must work in concert to adjust the dials and displays synchronously, reflecting (i.e., casting) the appropriate time elements. Some watch users may care a great deal about moon phases and tides, while others may want to know the day of the week, but it all must be reflected accurately within the current time zone.

Again, there is a similar need within data mesh: the ability to recast data based on any given point in time. This matters when hierarchies change over time. The original plan is built from the original hierarchy, likely at a summary level, but actuals are tracked against the current hierarchy, which can change. Many data products must be able to recast data, as needed by the business, to compare plans to actuals. If business users can recast only some data products to a given point in time but not others, the results are meaningless. Even simply casting data via current hierarchies can run into issues when data products refresh at different intervals, e.g., near real-time, intermittently, or nightly batch. Clear standards and best practices are needed around change data capture (CDC), slowly changing dimensions (SCDs) and service level goals (SLGs). These must be baked into the design of all data products, enabling time flexibility and synchronicity for all.
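The point-in-time recasting described above is commonly implemented with Type 2 slowly changing dimensions, where each version of a hierarchy member carries a validity interval. The sketch below illustrates that pattern; the schema and names (`DimRow`, `category_as_of`) are illustrative assumptions, not a prescribed design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimRow:
    """One version of a hierarchy member (SCD Type 2)."""
    product_id: str
    category: str             # hierarchy attribute that changes over time
    valid_from: date
    valid_to: Optional[date]  # None means this is the current version


def category_as_of(rows: list, product_id: str, as_of: date) -> Optional[str]:
    """Return the hierarchy value that was in effect on the given date."""
    for r in rows:
        if r.product_id != product_id:
            continue
        # Half-open interval: valid_from <= as_of < valid_to.
        if r.valid_from <= as_of and (r.valid_to is None or as_of < r.valid_to):
            return r.category
    return None


# "p1" was reclassified from Snacks to Beverages on 2021-07-01.
dim = [
    DimRow("p1", "Snacks",    date(2020, 1, 1), date(2021, 7, 1)),
    DimRow("p1", "Beverages", date(2021, 7, 1), None),
]

# The same fact can be recast against whichever hierarchy was in effect.
print(category_as_of(dim, "p1", date(2021, 1, 15)))   # Snacks
print(category_as_of(dim, "p1", date(2021, 12, 1)))   # Beverages
```

If every data product keeps validity intervals like this, plans built on the old hierarchy and actuals tracked on the new one can both be recast to the same point in time, which is exactly the synchronicity the gears analogy calls for.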

The data mesh approach enables data products to continuously evolve to meet new business needs. Within the data mesh, data governance responsibilities are pushed down to the domain team to own and implement within their data products, e.g., data quality, data integration, security, schema design and metadata. These concepts are well understood within organizations that have been doing centralized analytic data processing for years, providing enterprise data management via data stewards. As ownership moves to business areas via domain teams, those teams will need to ensure they have the business and technical skillsets to manage the bounded context of their data products and securely expose the required data to other domains and users.

Decentralization of data products can enable agility. Without a master design, however, chaos will ensue, resulting in point-solution sprawl. Here we see the importance of the watchmaker role in providing global standards, policies and best practices. Domain teams must clearly know their roles and responsibilities so that the data products they build bring value to the data mesh community of users.


About Dwayne Johnson

Dwayne Johnson is a Principal Ecosystem Architect at Teradata, with over 20 years' experience in designing and implementing enterprise architecture for large analytic ecosystems. He has worked with many Fortune 500 companies in the management of data architecture, master data, metadata, data quality, security and privacy, and data integration. He takes a pragmatic, business-led and architecture-driven approach to solving the business needs of an organization.
