Data Mesh Newsletter #06
Too Long; Didn’t Skim:
1) More meetups are coming. We have HelloFresh on June 24th and Kolibri Games + Barr Moses on July 1st (both at 6:30pm CEST / 12:30pm EDT / 9:30am PDT)
Zhamak, Barr, and Lena Hall will also be doing a Twitter Spaces conversation on June 30th (6pm EDT / 3pm PDT / 8am AEST on July 1st).
We also have tentatively scheduled meetups on July 8th and July 22nd.
2) We want to hear your story about why you are interested in data mesh. We’re doing a video interview series with all you wonderful folks, so please participate! More info here.
3) We’re seeing an uptick in usage of the concept “data on the inside versus data on the outside”: domain data that is consumed only within the domain and isn’t part of the mesh, versus data that is part of data products meant to be consumed externally. Brief primer here.
4) Free tickets to Apache Pulsar Summit care of DataStax here; more info below.
Interesting Data Mesh-Related Content
Building Data Platforms — The ETL bias
by João Vazao Vasques (Unbabel); Medium post
Great post on why pipelines (the author says ETL, but the argument seems to be about pipelines specifically) become fundamentally problematic as you scale, due to their inherent brittleness. Zhamak has basically declared death to pipelines because of this flaw. Bring the tooling to the data as often as possible; don’t push the data around if you can help it.
My hack for getting started with data as a product
by Carlos Aguilar (Glean); Medium post
Good post on how to approach data as a product inside a company that is not yet organized around the concept. The framework for identifying potential data products is especially interesting for data mesh: if you have an analysis that is rerun multiple times, consider automating it.
Product Thinking 101
by Naren Katakam (ThoughtWorks); Medium post
Provides a good framework for thinking about any software, with some specific points that apply directly to data product thinking. Jobs-to-be-Done is especially important: what do your data product consumers want to do with your data, and how can you enable that? If you are looking for a good set of internal interview questions to ask about data mesh, Intuit laid out their questions here.
Connecting Your Data Mesh with DataOps
by Christopher Bergh (DataKitchen; vendor); webinar (64min; no transcript)
A vendor webinar, but it gives a good starting framework for how DataOps and data mesh can go hand-in-hand. From our perspective, the concept of DataOps is great, and absolutely necessary, but data mesh needs to extend beyond the traditional responsibilities of DataOps to include a greater focus on the end data user experience (DUX).
Meetups and Other Upcoming Events
Data Mesh at HelloFresh - A work in progress - Six senior members of the HelloFresh team will talk about their data mesh journey and how they are tackling some of data mesh’s biggest challenges (data literacy, data modeling, data products, etc.). Q&A afterwards.
June 24th at 6:30pm CEST / 12:30pm EDT / 9:30am PDT
Kolibri Games’ Data Mesh Journey & Calculating Data Mesh ROI with Barr Moses - António Fitas of Kolibri Games will share their data mesh journey so far including some pitfalls they have faced along the way. Full agenda to come. Barr Moses and Scott O’Leary of Monte Carlo will cover calculating the ROI of data mesh and whether data mesh is even right for your organization. Q&A afterwards.
July 1st at 6:30pm CEST / 12:30pm EDT / 9:30am PDT
(Non-DML event) Data Mesh Panel Discussion with Zhamak Dehghani and Barr Moses - Lena Hall (Microsoft) will be hosting a Twitter Spaces panel discussion with Zhamak and Barr. Anyone can listen in through either the web or the Twitter app - the link will be available on any of their Twitter pages.
June 30th at 6pm EDT / 3pm PDT / Jul 1st 8am AEST
DataStax purchased 20 tickets to the Apache Pulsar Summit* for the data mesh community. Free ticket link here; regular conference link here; if tickets are gone by the time you see this, feel free to message email@example.com to see if there are any floating free tickets.
June 16-17th starting at 5:30pm CEST / 11:30am EDT / 8:30am PDT
*Apache Pulsar is a streaming technology similar to Apache Kafka with a focus on performance and easier rebalancing at scale; we are aware of a few companies using Pulsar for data mesh but they haven’t shared it publicly yet.
If you have questions/comments/concerns/suggestions for future newsletters, please let us know at firstname.lastname@example.org.
Special thanks to DataStax for allowing me to focus on this community and hopefully helping you all learn more about data mesh 😅