Dispatches by Deepak Alur

Where's your metadata? Part 3

Exploring open-source metadata systems

Deepak Alur
Nov 3, 2021

In my earlier post, I shared that I had narrowed my exploration of open-source metadata systems to two options: OpenMetadata (from ex-Uber folks) and DataHub (from LinkedIn).

See: Where's your metadata? Part 2

I want to share another article, this one by Julia Valenti from Cisco. She recently did an excellent job of sharing her insights. I hope she writes a similar article on DataHub as well, for comparison.

See: Why OpenMetadata is taking the right approach to metadata cataloging

Julia identifies three main things we should care about in any metadata system: a schema-based design, APIs, and lineage. At a high level, both systems support all three, but they differ in the details.

On APIs, note that DataHub has a rich set of GraphQL APIs, while OpenMetadata has a rich collection of REST APIs. More important than the choice between GraphQL and conventional REST is what these APIs actually let you do: the features they expose, and how easily your developers can use and integrate them to derive maximum value.
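To make the difference in style concrete, here is a minimal sketch of how the same lookup might be phrased against a REST API and against a GraphQL API. Everything in it is an assumption for illustration: the base URL, the paths, the dataset name, and the GraphQL field names are hypothetical and are not taken from DataHub or OpenMetadata. The code only builds the requests; it does not send them.

```python
import json
from urllib.parse import quote

# Hypothetical catalog URL and dataset name, for illustration only.
BASE_URL = "https://catalog.example.com"
DATASET = "warehouse.sales.orders"


def rest_request(dataset: str) -> dict:
    """REST style: the resource is addressed by its URL,
    and the server decides the shape of the response."""
    return {
        "method": "GET",
        "url": f"{BASE_URL}/api/tables/{quote(dataset, safe='')}",
        "body": None,
    }


def graphql_request(dataset: str) -> dict:
    """GraphQL style: a single endpoint, and the client
    names exactly the fields it wants back."""
    # Hypothetical schema: a `dataset` query with name, owner, and columns.
    query = """
    query GetDataset($name: String!) {
      dataset(name: $name) {
        name
        owner
        columns { name type }
      }
    }
    """
    return {
        "method": "POST",
        "url": f"{BASE_URL}/graphql",
        "body": json.dumps({"query": query, "variables": {"name": dataset}}),
    }


if __name__ == "__main__":
    for request in (rest_request(DATASET), graphql_request(DATASET)):
        print(request["method"], request["url"])
```

Either style can serve a developer well. What matters in practice is whether the fields and operations you need are actually available behind the endpoint.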

On lineage, as Julia notes, everyone is working on improving lineage information. In my opinion, this is where a "standard" can come into play at some point. I hope the metadata community converges and collaborates to establish some standards among all these emerging systems. I, as a user, would want that from these systems. It becomes essential for interoperability and portability, but I am getting ahead of myself here.

Metadata aside, I am getting curious about actual data access management. Sure, we get great information about data from metadata systems. But when tools or programs access the underlying datasets and data sources, how do the metadata systems make that easier? How and where should we implement "access control" for those datasets and data sources? Where do the access control policies reside, and where do they get enforced? What is the role of an API gateway in relation to metadata systems?

I believe that metadata plays a significant role in policy definitions for access control and management. If you have opinions or pointers for these areas, please share.
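One way to picture metadata feeding into access control is a policy that keys off dataset tags. The sketch below is only an illustration under stated assumptions: the `DatasetMetadata` and `Policy` types, the `PII` tag, and the role names are all hypothetical, and no particular metadata system is implied. Where such a check would actually run (in the query engine, in an API gateway, or elsewhere) remains the open question.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical catalog entry: a dataset name plus its tags."""
    name: str
    tags: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Policy:
    """Hypothetical rule: datasets carrying `tag` require one of `allowed_roles`."""
    tag: str
    allowed_roles: frozenset


def is_allowed(user_roles: set, dataset: DatasetMetadata, policies: list) -> bool:
    """Return True unless some policy applies to the dataset's tags
    and the user holds none of that policy's allowed roles.

    Note: untagged datasets are allowed by default here, which is an
    assumption of this sketch, not a recommendation.
    """
    for policy in policies:
        applies = policy.tag in dataset.tags
        if applies and not (user_roles & policy.allowed_roles):
            return False
    return True


if __name__ == "__main__":
    orders = DatasetMetadata("sales.orders", frozenset({"PII"}))
    policies = [Policy("PII", frozenset({"data-steward", "privacy-officer"}))]

    print(is_allowed({"analyst"}, orders, policies))
    print(is_allowed({"data-steward"}, orders, policies))
```

The point of the sketch is that the policy never names a dataset directly. It refers only to metadata, so tagging a new dataset in the catalog is enough to bring it under the rule.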
