...

  • Type of dataset
  • Creation time - property
  • Last update time? - property
  • RecordScannable/BatchWritable/RecordWritable/BatchReadable
  • Other properties

Streams

  • Format
  • View

Schema as Metadata

Schema as metadata lets CDAP users retrieve datasets/streams that have a field X, optionally of type Y.
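
To make the intended lookup concrete, here is a minimal in-memory sketch of a "field X, optionally of type Y" query. It is not CDAP code: the class SchemaMetadataIndex, its methods, and the entity naming ("dataset:...", "stream:...") are hypothetical names chosen for illustration, and schemas are simplified to a flat map of field name to type name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory index: entity name -> (field name -> type name). */
class SchemaMetadataIndex {
  private final Map<String, Map<String, String>> schemas = new LinkedHashMap<>();

  /** Records the (flattened) schema of a dataset or stream as metadata. */
  void setSchema(String entity, Map<String, String> fields) {
    schemas.put(entity, new LinkedHashMap<>(fields));
  }

  /**
   * Returns the entities that have the given field, in insertion order.
   * If type is non-null, the field must also be of that type.
   */
  List<String> findByField(String field, String type) {
    List<String> matches = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> entry : schemas.entrySet()) {
      String actualType = entry.getValue().get(field);
      if (actualType != null && (type == null || type.equals(actualType))) {
        matches.add(entry.getKey());
      }
    }
    return matches;
  }
}
```

Calling findByField("user", null) matches every entity with a "user" field, while findByField("user", "string") narrows the result to those where the field is a string. A real implementation would presumably sit on an indexed store rather than scan every schema.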

...

However, if stored as a separate dataset, the metadata system will have to manage two different datasets. APIs may need filters, etc. (TODO: details)

Storing History - same pattern as Business Metadata

Runtime

System Metadata will be added/updated when:

  1. An app is deployed - We will add a SystemMetadataUpdater stage in the deployment pipeline that will update system metadata for the app, as well as all the programs in the app.
  2. A new dataset instance is created - The LineageWriterDatasetFramework can be extended to update system metadata when a dataset is added.
  3. A new stream is created - 

Deletes of any of the above will also update system metadata.
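
The lifecycle triggers above can be sketched as a set of hooks that write system properties when entities are created and clean them up on delete. This is an illustration only: SystemMetadataHooks, its method names, the entity id format, and the property keys are all hypothetical, and it does not use the actual deployment pipeline or LineageWriterDatasetFramework APIs. It also assumes that a delete simply drops the entity's current system metadata.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: system properties per entity, updated on lifecycle events. */
class SystemMetadataHooks {
  private final Map<String, Map<String, String>> properties = new LinkedHashMap<>();

  /** App deployed: record metadata for the app and for each of its programs. */
  void onAppDeployed(String app, List<String> programs) {
    put("app:" + app, "type", "application");
    for (String program : programs) {
      put("program:" + app + "." + program, "app", app);
    }
  }

  /** New dataset instance created: record its dataset type. */
  void onDatasetCreated(String dataset, String datasetType) {
    put("dataset:" + dataset, "type", datasetType);
  }

  /** New stream created. */
  void onStreamCreated(String stream) {
    put("stream:" + stream, "type", "stream");
  }

  /** Entity deleted: assumption here is that its current system metadata is dropped. */
  void onEntityDeleted(String entityId) {
    properties.remove(entityId);
  }

  /** Returns the system properties of an entity, or an empty map if none exist. */
  Map<String, String> get(String entityId) {
    return properties.getOrDefault(entityId, Map.of());
  }

  private void put(String entityId, String key, String value) {
    properties.computeIfAbsent(entityId, id -> new LinkedHashMap<>()).put(key, value);
  }
}
```

Keeping all writes behind these hooks means no caller outside the system path needs a mutating API.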

System Metadata Updates

Only the CDAP system can update system metadata for entities; this capability will not be exposed to users. Given this design choice, users will need a way to discover all the system tags/properties. Initially, this can be exposed through a simple API that lists all tags/properties. It can later be extended with full-text search once CDAP has a more comprehensive search capability that goes beyond IndexedTables and prefix lookups.
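
One way to picture the split between system-only writes and user-facing discovery is a read-only interface over a system-side registry. The sketch below is an assumption-laden illustration, not the proposed API: SystemMetadataDiscovery, SystemMetadataRegistry, and their methods are hypothetical names, and package-private mutators stand in for "only the CDAP system can update".

```java
import java.util.Collections;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical read-only view that would be exposed to users. */
interface SystemMetadataDiscovery {
  SortedSet<String> listTags();

  SortedSet<String> listPropertyKeys();
}

/** Hypothetical system-side registry; mutators are package-private on purpose. */
class SystemMetadataRegistry implements SystemMetadataDiscovery {
  private final SortedSet<String> tags = new TreeSet<>();
  private final SortedSet<String> propertyKeys = new TreeSet<>();

  /** System-only: registers a system tag. */
  void addTag(String tag) {
    tags.add(tag);
  }

  /** System-only: registers a system property key. */
  void addPropertyKey(String key) {
    propertyKeys.add(key);
  }

  /** Lists all known system tags, sorted, as an unmodifiable view. */
  @Override
  public SortedSet<String> listTags() {
    return Collections.unmodifiableSortedSet(tags);
  }

  /** Lists all known system property keys, sorted, as an unmodifiable view. */
  @Override
  public SortedSet<String> listPropertyKeys() {
    return Collections.unmodifiableSortedSet(propertyKeys);
  }
}
```

Because the listings are sorted sets, a prefix lookup could later be layered on with subSet without changing the interface, while full-text search would need a different backing store.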

...