...
- Type of dataset
- Creation time - property
- Last update time? - property
- RecordScannable/BatchWritable/RecordWritable/BatchReadable
- Other properties
Streams
- Format
- View
Schema as Metadata
Schema as metadata lets CDAP users retrieve datasets/streams that have a field X, optionally of type Y.
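The lookup described above can be sketched as follows. This is a minimal, in-memory illustration only: the class `SchemaSearchSketch`, its methods, and the representation of a schema as a map of field name to type name are all assumptions made for this example, not CDAP APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not a CDAP API): find entities whose schema has a given field. */
final class SchemaSearchSketch {
    // Entity name -> (field name -> field type). Stands in for the metadata store.
    private final Map<String, Map<String, String>> schemas = new LinkedHashMap<>();

    /** Records the schema of a dataset or stream as a flat field-to-type map. */
    public void register(String entity, Map<String, String> fields) {
        schemas.put(entity, new LinkedHashMap<>(fields));
    }

    /**
     * Returns the entities that have a field named fieldName, in registration order.
     * If fieldType is non-null, the field's type must also match (case-insensitive).
     */
    public List<String> find(String fieldName, String fieldType) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : schemas.entrySet()) {
            String actualType = entry.getValue().get(fieldName);
            boolean hasField = actualType != null;
            boolean typeMatches = fieldType == null || fieldType.equalsIgnoreCase(actualType);
            if (hasField && typeMatches) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }
}
```

Calling `find("price", null)` corresponds to "field X of any type", while `find("price", "double")` corresponds to "field X of type Y".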
...
However, if it is stored as a separate dataset, the metadata system will have to manage two different datasets. APIs may also need filters, etc. - TODO: Details
Storing History - same pattern as Business Metadata
Runtime
System Metadata will be added/updated when:
- An app is deployed - We will add a SystemMetadataUpdater stage in the deployment pipeline that will update system metadata for the app, as well as all the programs in the app.
- A new dataset instance is created - The LineageWriterDatasetFramework can be extended to update system metadata when a dataset is added.
- A new stream is created -
- Deletes for all of the above
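The lifecycle hooks listed above could look roughly like the sketch below. It is not the actual deployment-pipeline stage or the LineageWriterDatasetFramework extension: the class `SystemMetadataUpdaterSketch`, its method names, the property keys (`entity-type`, `creation-time`, `type`), and the `app/program` id format are all hypothetical, and an in-memory map stands in for the system metadata store. It also assumes that a redeploy keeps the original creation time and that deleting an app removes metadata for its programs.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not a CDAP API): keeps system metadata in step with entity lifecycle. */
final class SystemMetadataUpdaterSketch {
    // Entity id -> system properties. Stands in for the system metadata store.
    private final Map<String, Map<String, String>> store = new HashMap<>();

    /** App deployed: record metadata for the app and for every program in it. */
    public void onAppDeployed(String appId, List<String> programIds, long nowMillis) {
        record(appId, "application", nowMillis);
        for (String programId : programIds) {
            record(appId + "/" + programId, "program", nowMillis);
        }
    }

    /** Dataset instance created: record its dataset type alongside the common properties. */
    public void onDatasetCreated(String datasetId, String datasetType, long nowMillis) {
        record(datasetId, "dataset", nowMillis);
        store.get(datasetId).put("type", datasetType);
    }

    /** Entity deleted: drop its metadata and that of its children (ids prefixed with "id/"). */
    public void onEntityDeleted(String entityId) {
        store.keySet().removeIf(id -> id.equals(entityId) || id.startsWith(entityId + "/"));
    }

    /** Read-only view of an entity's system properties; empty if none are recorded. */
    public Map<String, String> get(String entityId) {
        Map<String, String> props = store.get(entityId);
        return props == null
            ? Collections.<String, String>emptyMap()
            : Collections.unmodifiableMap(props);
    }

    private void record(String entityId, String entityType, long nowMillis) {
        Map<String, String> props = store.computeIfAbsent(entityId, k -> new HashMap<>());
        props.put("entity-type", entityType);
        // Assumption: creation time is set once and survives redeploys.
        props.putIfAbsent("creation-time", Long.toString(nowMillis));
    }
}
```

Only system code would call the `on...` methods; users would see the result through read paths such as `get`.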
System Metadata Updates
Only the CDAP system can update system metadata for entities; this capability will not be exposed to users. Given this design choice, users will need a way in CDAP to discover all the system tags/properties. Initially, this can be exposed via a simple API that lists all tags/properties. It can later be extended with full-text search once CDAP has a more comprehensive search capability that goes beyond IndexedTables and prefix lookups.
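A minimal sketch of such a discovery API follows, assuming tags are kept in a sorted set so that both "list everything" and prefix lookup are cheap. The class `SystemTagCatalogSketch` and its methods are hypothetical and do not correspond to an existing CDAP endpoint or to IndexedTable itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch (not a CDAP API): read-only discovery of system tags. */
final class SystemTagCatalogSketch {
    // Sorted so that listing is ordered and prefix lookup is a range scan.
    private final TreeSet<String> tags = new TreeSet<>();

    /** Intended for the system only; users would not be given access to this. */
    public void add(String tag) {
        tags.add(tag);
    }

    /** Lists every known system tag in sorted order. */
    public List<String> listAll() {
        return new ArrayList<>(tags);
    }

    /** Returns the tags that start with the given prefix, via a range scan on the sorted set. */
    public List<String> withPrefix(String prefix) {
        // The upper bound is the prefix followed by the largest char value,
        // so every string that starts with the prefix sorts below it.
        return new ArrayList<>(tags.subSet(prefix, prefix + Character.MAX_VALUE));
    }
}
```

A later full-text search would replace `withPrefix` with a richer query path while `listAll` could remain unchanged.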
...