Data Mesh Newsletter #10
The quick bit
1) Seeing a real need for the concept of blueprints for data products in data mesh. Essentially, something a domain team can use to easily build out 90%+ of the infrastructure pieces of a data product. Most data products don’t need super custom processes, transformations, input ports, or output ports so a quick scaffolding is very useful. Any good content on this? Anyone willing to share their blueprints publicly?
2) We are launching a panel talks series in collaboration with the Open|Source|Data Podcast. The first panel will be on Data Discovery, hosted by the amazing Paco Nathan of Derwen.ai (LinkedIn|Twitter), and will feature Shinji Kim (SelectStar), Shirshanka Das (Acryl Data/DataHub), Mark Grover (Stemma.ai/Amundsen), and Sophie Watson (Red Hat). August 20 (6:30pm CEST / 12:30pm EDT). Send ideas for future panels to Scott.
3) Interesting article on potential data mesh topologies here. We highly recommend thinking it through: the first pattern doesn’t fully align with data mesh, but it is good food for thought.
4) DataMOSS efforts are moving forward. People want to see your examples (APIs, blueprints, output ports, data product notebooks, etc.). If you want to throw them on GitHub to share, awesome, send us the links and we will pass them along. If you want to “throw it over the wall” and not deal with running an OSS project/repo, get in touch and we will host it as “EXAMPLE ONLY”. 😎 Let’s make it the default to share some of what you are doing!
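To make the blueprint idea from item 1 a bit more concrete, here is a minimal sketch of what scaffolding a data product from shared defaults could look like. Everything in it is hypothetical: the `Blueprint` fields, the port names (`kafka_topic`, `sql_table`, `parquet_files`), and the `scaffold` helper are our own illustrations, not taken from any real platform or from anyone's published blueprint. The point is only that a domain team accepts the defaults and overrides the few pieces that are actually custom.

```python
from dataclasses import dataclass, field


@dataclass
class Blueprint:
    """Hypothetical defaults a platform team might ship for most data products."""
    input_ports: list = field(default_factory=lambda: ["kafka_topic"])
    output_ports: list = field(default_factory=lambda: ["sql_table", "parquet_files"])
    transformations: list = field(default_factory=lambda: ["dedupe", "validate_schema"])


def scaffold(domain: str, product: str, blueprint: Blueprint, **overrides) -> dict:
    """Merge blueprint defaults with a domain team's overrides into one spec."""
    spec = {
        "domain": domain,
        "name": product,
        "input_ports": list(blueprint.input_ports),
        "output_ports": list(blueprint.output_ports),
        "transformations": list(blueprint.transformations),
    }
    # Reject overrides the blueprint does not know about, so typos fail loudly.
    unknown = set(overrides) - set(spec)
    if unknown:
        raise ValueError(f"unknown blueprint fields: {sorted(unknown)}")
    spec.update(overrides)
    return spec


# A domain team keeps every default except the output ports.
orders = scaffold("sales", "orders", Blueprint(), output_ports=["sql_table"])
```

In a real setup the resulting spec would presumably feed whatever provisioning tooling your platform uses; that part is deliberately left out here.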
Interesting Data Mesh Related Content
Data Mesh topologies
by Piethein Strengholt (Microsoft); article
Very interesting and thought-provoking article. While we do not agree that the “Governed Mesh Topology” really aligns with a pure approach to data mesh, it might be a stepping stone along the way. It could also get you thinking about how you would deploy sensitive data with governance into other regions/countries so that cross-regional consumption costs and latency do not become an issue.
Data Mesh at CMC Markets: Past, Present and Future (DML Meetup)
by Tareq Abedrabbo, Lorenzo Nicora, and Michał Stypik; video (86min)
Great presentation by CMC Markets team re their evolution towards data mesh, including some really good insights about driving buy-in, metadata management, and more.
Note: per Zhamak’s vision (as stated in the community Slack), data mesh data products should be read-only. There is also an emerging pattern of operational-use data in some data mesh data products, but analytical data must always be a first-class concern.
Architect’s Open-Source Guide for a Data Mesh Architecture
by Lena Hall (Microsoft); video (31min)
A good overview of data mesh as well as many of the high-level technology requirements companies will need to implement data mesh. Also includes a number of technologies/solutions for solving those needs from many companies. Not a vendor pitch 😉
Capgemini Data-Powered Innovate Wave 2 (not gated; PDF)
An Engineering Approach to Data Mesh (by Ron Tolido - Capgemini and Stephen Brobst - Teradata) - one article of many, starts on page 24
Suggests an interesting approach to making data products more easily queryable/interoperable: use a common data fabric (query fabric is probably a better term) in all your data products. Can’t say we’ve seen this in any deployments yet, but it makes logical sense at first pass. As a reminder, using data fabric queries/data virtualization AS your data product (so pulling data live from operational systems) is a VERY STRONG anti-pattern in data mesh. It might be useful for prototyping/initial “data product marketing” but is likely to cause issues very quickly.
The Cost of Choosing to Not Version Your Data (DML Meetup)
by Gavin Mendel-Gleason (TerminusDB) and Paul Singman (Treeverse); video (62min)
Gavin and Paul walked us through the different types of versioning data and how they can solve a number of interesting problems. Important topic when designing data products and especially as they evolve. Should be a keystone when discussing contracts for your data products.
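As a rough illustration of why versioning matters for data product contracts, here is a minimal sketch of snapshot-style versioning, assuming each published state of a data product gets an ID derived from its content. This is our own toy example, not the approach of TerminusDB, Treeverse, or the talk itself; the `VersionedDataset` class and its methods are hypothetical names. The idea it shows is that a consumer can pin a specific version and keep getting the same rows even after the producer publishes new data.

```python
import copy
import hashlib
import json


class VersionedDataset:
    """Toy in-memory store: every commit is an immutable snapshot keyed by content hash."""

    def __init__(self):
        self._versions = {}  # version id -> snapshot of rows
        self._history = []   # version ids in publication order

    def commit(self, rows):
        """Store a snapshot of `rows` and return its content-derived version id."""
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        version_id = hashlib.sha256(payload).hexdigest()[:12]
        self._versions.setdefault(version_id, copy.deepcopy(rows))
        if not self._history or self._history[-1] != version_id:
            self._history.append(version_id)
        return version_id

    def read(self, version_id=None):
        """Return a copy of the pinned version, or of the latest one if none is given."""
        if version_id is None:
            version_id = self._history[-1]
        return copy.deepcopy(self._versions[version_id])


dataset = VersionedDataset()
v1 = dataset.commit([{"order_id": 1, "total": 10}])
v2 = dataset.commit([{"order_id": 1, "total": 10}, {"order_id": 2, "total": 25}])

pinned = dataset.read(v1)  # a consumer pinned to v1 still sees one row
latest = dataset.read()    # an unpinned consumer sees the newest snapshot
```

A contract for a data product could then state which version IDs consumers may rely on and for how long, rather than only describing the schema.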
Upcoming Meetups / Zhamak Stuff
Aug 5: Data Mesh at Saxo Bank (4pm CEST / 7:30pm IST / 10am EDT / 7am PDT) - Sheetal Pratik (see her post above re data discovery) will present on how Saxo Bank is approaching data mesh with a focus on how they are leveraging data mesh with an event-driven architecture. Much more to come. Q&A afterwards.
Aug 12: Break
Aug 19: Financial Services Co (To be Announced)
Aug 20 (Panel): Data Discovery Panel hosted by Paco Nathan of Derwen.ai (LinkedIn|Twitter), featuring Shinji Kim (SelectStar), Shirshanka Das (Acryl Data/DataHub), Mark Grover (Stemma.ai/Amundsen), and Sophie Watson (Red Hat). 10pm IST / 6:30pm CEST / 12:30pm EDT / 9:30am PDT
Aug 26: Building a Data Platform for Data Mesh at Flexport
Sept 2: Delhivery
Sept 9: Still figuring this one out
Sept 16: AO
Sept 23: Overstock.com
Sept 30: Adevinta
October speaking slots open! Contact Scott if you want to share your journey!
Other Data Mesh Meetups/Webinars/Training
Zhamak-Led Training (4 days, 4hrs each day, Sept 20, 21, 27, 28); 15% discount link makes it 1,657.50 Euro (we don’t understand VAT) - early bird pricing ends soon. The last training was phenomenally helpful, especially re deriving domains, in our opinion.
Data mesh and domain ownership webinar; Zhamak and Danilo Sato (ThoughtWorks); Aug 4 at 5pm CEST / 8:30pm IST / 11am EDT / 8am PDT
Data Mesh and Governance webinar; Zhamak, Chris Ford, and Jason Hare (ThoughtWorks); Aug 26 at 6pm CEST / 9:30pm IST / noon EDT / 9am PDT
Data Platform in a mesh architecture webinar; Zhamak and Emily Gorcenski (ThoughtWorks); Oct 14 at 5:30pm CEST / 9:00pm IST / 11:30am EDT / 8:30am PDT
If you have questions/comments/concerns/suggestions for future newsletters, please let us know at firstname.lastname@example.org.
Special thanks to DataStax for allowing me to focus on this community and hopefully helping you all learn more about data mesh 😅