...

There will be a system service that pushes our metadata changes to an external metadata management system. The service is optional: it can be enabled in cdap-site.xml, and the external system is pluggable. For this work, the external system will be Navigator, but in the future we can support Apache Atlas. The system service will subscribe to the Kafka topic to which the CDAP MetadataAdmin publishes metadata changes. These messages are then pushed to the external system; in the case of Navigator, we could use the Navigator SDK Java client. We will also have to use a system dataset to store the Kafka offset, so that the service can resume from where it left off after a restart. A potential downside of this approach is that the service will consume another valuable container in the cluster.
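The push loop described above can be sketched roughly as follows. This is only an illustration of the control flow, not the actual service: `MetadataChange`, `ExternalMetadataSink`, `OffsetStore`, and `pushBatch` are hypothetical names, the Kafka consumer and the Navigator SDK client are replaced by plain Java stand-ins, and the system dataset is replaced by an in-memory store. The sketch assumes the offset is saved only after a successful push, which gives at-least-once delivery to the external system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MetadataPushSketch {

  /** Hypothetical stand-in for one metadata change message read from the Kafka topic. */
  static final class MetadataChange {
    final long offset;
    final String payload;

    MetadataChange(long offset, String payload) {
      this.offset = offset;
      this.payload = payload;
    }
  }

  /** Hypothetical pluggable external system (Navigator for now, possibly Atlas later). */
  interface ExternalMetadataSink {
    void push(MetadataChange change) throws Exception;
  }

  /** Hypothetical offset storage; the design would back this with a system dataset. */
  interface OffsetStore {
    long load();            // next offset that still needs to be pushed
    void save(long offset);
  }

  /** In-memory stand-in for the system dataset, for illustration only. */
  static final class InMemoryOffsetStore implements OffsetStore {
    private long next = 0;

    @Override
    public long load() {
      return next;
    }

    @Override
    public void save(long offset) {
      next = offset;
    }
  }

  /**
   * Pushes every change at or beyond the stored offset and records progress after
   * each successful push. If push() throws, the offset is not advanced, so the
   * same message is retried on the next run. Returns the number of changes pushed.
   */
  static int pushBatch(List<MetadataChange> batch, ExternalMetadataSink sink,
                       OffsetStore offsets) throws Exception {
    long next = offsets.load();
    int pushed = 0;
    for (MetadataChange change : batch) {
      if (change.offset < next) {
        continue; // already delivered before a restart
      }
      sink.push(change);
      next = change.offset + 1;
      offsets.save(next);
      pushed++;
    }
    return pushed;
  }

  public static void main(String[] args) throws Exception {
    List<String> received = new ArrayList<>();
    OffsetStore offsets = new InMemoryOffsetStore();
    List<MetadataChange> batch = Arrays.asList(
        new MetadataChange(0, "tag added"),
        new MetadataChange(1, "property updated"));

    pushBatch(batch, change -> received.add(change.payload), offsets);
    // Replaying the same batch simulates a restart: nothing is pushed twice.
    pushBatch(batch, change -> received.add(change.payload), offsets);
    System.out.println("pushed=" + received.size() + ", nextOffset=" + offsets.load());
  }
}
```

Keeping the sink behind an interface mirrors the pluggable design: swapping Navigator for another system would mean providing a different `ExternalMetadataSink`, while the offset handling stays the same.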
 

Additional Details:

Cloudera Navigator Primer 

Cloudera Navigator is a one-stop shop for users of a CDH cluster to query and modify the metadata of various Hadoop entities. Navigator is also useful for audits and lineage, but in this integration we are focusing only on the metadata part.
Here is a screenshot of how the metadata is displayed for an HDFS directory.

...