Data Mesh Newsletter #09
Jul 21, 2021 Edition
The high level:
0) Phenomenal article on building a successful data mesh, HIGHLY recommend - long but worth it.
1) Our JPMC meetup was awesome, recording here. Lots of very useful insights. Good write-up of the meetup by Wikibon here.
2) Had a blast hosting a panel-style meetup on what is a data product in data mesh; recording here, slides here, short post by Adevinta on the topic here.
3) More meetups coming:
Jul 22 All about data versioning (key to data products)
Jul 29 CMC Markets (their QCon Plus talk had the best insights re driving internal buy-in of any content we’ve seen)
Aug 5 Saxo Bank (they’ve put out a lot of great data mesh content recently here and here); agenda coming soon
4) Thank you to Charlotte Ledoux of the governance-focused startup Vallai for being the first person to record a “Person of the Data Mesh” video. It’s awesome and only 5min; check it out here. Please submit your own (community@datameshlearning.com)!
5) Useful Twitter thread by John Cutler that is wildly applicable to data mesh. Essentially, please do not try to copy another company’s implementation verbatim. It’s why we are doing meetups, so people can see what questions the company asked and how they answered; then you can ask and answer those questions re your own organization.
6) Looking for collaborators in OSS for data mesh; want to have an umbrella for people to share data APIs, sample notebooks w/ dummy data, sample documentation, data harmonization specifications, etc. More info coming soon-ish. Going to be called DataMOSS (Data Mesh OSS). Think about what you wish you could have seen an example of when implementing part of your data mesh. Be a pal and open source what you can so others don’t have the same pains. No need to support it or do anything more than let people see an example.
Interesting Data Mesh Related Content
Building a successful Data Mesh – More than just a technology initiative
by Nazia Shahrin (RBC/University of Toronto); LinkedIn post (long but worth it)
Simply put, the best written piece on data mesh not by Zhamak that we’ve seen (in our opinion). It dives into how you would actually approach implementing a data mesh at a large organization, e.g. the different decision points around governance, standardization/harmonization, data privacy, etc. It covers all the pillars well and feels like it could be 3x as long and still be engaging. (Just read it, trust us!)
Data as a product vs data products. What are the differences?
by Xavier Gumara Rigol (Adevinta); Medium post on Towards Data Science blog
Good short article on the difference between data products (a broad industry term) and the data products in data mesh, which are driven by data-as-a-product thinking. One other thing to note is that data products in data mesh are “data on the outside”, but in other contexts data products are often “data on the inside” (inside/outside topic explained here).
Data Lake Strategy via Data Mesh Architecture at JPMorgan Chase (DML Meetup)
James “JR” Reid, Sarita Bakst, and Arup Nanda; video (84min)
Fantastic meetup with a CIO, a Chief Information Architect, and the Head of Cloud Data at JPMC. They covered a wide range of topics including how they organize their domains into one high-level product with a number of sub-products and how they are pushing governance down to the domains. Their ingesting and routing setup is super interesting too. Good write-up of the meetup by Dave Vellante of theCUBE/Wikibon here.
Enabling Data Discovery in a Data Mesh: The Saxo Journey
by Sheetal Pratik (Saxo Bank); Medium post on DataHub blog
Saxo Bank describes how they leverage the OSS tools DataHub and Great Expectations to make their data discoverable and to maintain lineage, data quality, etc., all with “light touch” governance from the central Data Office. Governance had been a bottleneck, but the team feels they are moving faster now. Data discoverability continues to be a challenge we hear about with most data mesh implementations.
Anatomy of a Data Product in Data Mesh (DML Meetup)
Mohammad Syed (Credera), Pedro Castillo (HelloFresh), and Scott Hirleman (DML); video (76min)
Meetup on just what a data product is in the data mesh world. We covered a high-level definition, some recommended practices, the “permission” from Pedro to literally start with one giant data product in a domain that is just an Excel workbook (gotta start somewhere!), the architectural pieces of a data product, etc. Overall, a fun conversation and hopefully useful/educational.
Two Podcasts on Data Mesh (The Evolution of Data Architecture: Moving to a Data Mesh and Data Mesh: A Deeper Dive)
by Andrew Wolfe and Fahad Shoukat (Skiplist); podcasts (31min and 31min)
Two older podcasts (Aug 2020 and Mar 2021), but they show an evolution in thinking around data mesh and offer some interesting insights. The first is mostly an approachable explanation of data mesh, and the second includes a great discussion about the size of data products (data nodes on the mesh) - DML disagrees with their conclusions, but it is still a great topic for debate.
Upcoming Meetups / Zhamak Stuff
DML Meetups (6:30pm CEST / 12:30pm EDT / 9:30am PDT / 10pm IST unless noted):
Jul 22: The Cost of Choosing to Not Version Your Data - Paul Singman of Treeverse (lakeFS) will cover data product versioning from a data + code standpoint relative to data mesh. Gavin Mendel-Gleason of TerminusDB will cover data product schema versioning and what that means in data mesh. Q&A afterwards.
Jul 29: Data Mesh at CMC Markets: Past, Present and Future - Tareq Abedrabbo, Lorenzo Nicora, and Michał Stypik from the core engineering team will cover CMC Markets’ journey with data mesh so far. They are somewhat unique in that they were already decentralized but were facing issues with data siloes. They have great insights across topics including driving buy-in. Q&A afterwards.
Aug 5: Data Mesh at Saxo Bank (4pm CEST / 7:30pm IST / 10am EDT / 7am PDT) - Sheetal Pratik (see her post above re data discovery) will present on how Saxo Bank is approaching data mesh with a focus on how they are leveraging data mesh with an event-driven architecture. Much more to come. Q&A afterwards.
September and October speaking slots open! Contact Scott if you want to share your journey!
Other Data Mesh Meetups/Webinars/Training
Zhamak-Led Training (4 days, 4hrs each day, Sept 20, 21, 27, 28); 15% discount link makes it 1,657.50 Euro (we don’t understand VAT) - early bird pricing ends soon. The last training was phenomenally helpful, especially re deriving domains, in our opinion.
Data mesh and domain ownership webinar; Zhamak and Danilo Sato (ThoughtWorks); Aug 4 at 5pm CEST / 8:30pm IST / 11am EDT / 8am PDT
Data Mesh and Governance webinar; Zhamak, Chris Ford, and Jason Hare (ThoughtWorks); Aug 26 at 6pm CEST / 9:30pm IST / noon EDT / 9am PDT
Data Platform in a mesh architecture webinar; Zhamak and Emily Gorcenski (ThoughtWorks); Oct 14 at 5:30pm CEST / 9:00pm IST / 11:30am EDT / 8:30am PDT
/Newsletter
If you have questions/comments/concerns/suggestions for future newsletters, please let us know at community@datameshlearning.com.
Special thanks to DataStax for allowing me to focus on this community and hopefully helping you all learn more about data mesh 😅