...
Another possibility was to store the real key value in a separate table and the indexes in the indexedTable. This would avoid empty column values in a row, but it would lead to 6 tables in total (3 each for system and business), so we decided against it.
...
In addition to the above goals, we also plan to do the following:
Metadata Search Results:
- CDAP-4274 - Metadata search should return the metadata of matching entities (Open)
- Also return some other relevant info. Please see details below.
Search Result
Metadata search will return entities with the following details, depending on the entity type.
Application
- Type
- Name
- Metadata: Tags and Properties
- App Description
Program
- Type (Note: if Type=Workflow, also show all programs under the workflow)
- Name
- Metadata: Tags and Properties
- App it belongs to
Artifact
- Type
- Name
Dataset
- Type
- Name
Stream
- Name
- Type
View
- Name
- Type
- Stream Name
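As a rough illustration of the result shape described above, the following Java sketch models a search hit as the entity's identity, its tags and properties, and a map of type-specific details. All class, method, and key names here (MetadataSearchSketch, SearchResult, "workflowPrograms", and so on) are hypothetical and are not taken from the CDAP codebase. The real response format may well differ.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the CDAP codebase.
class MetadataSearchSketch {

  enum EntityType { APPLICATION, PROGRAM, ARTIFACT, DATASET, STREAM, VIEW }

  // One search hit: the entity's identity, its metadata (tags and
  // properties), and details that depend on the entity type.
  static final class SearchResult {
    final EntityType entityType;
    final String name;
    final List<String> tags;
    final Map<String, String> properties;
    final Map<String, String> details;

    SearchResult(EntityType entityType, String name, List<String> tags,
                 Map<String, String> properties, Map<String, String> details) {
      this.entityType = entityType;
      this.name = name;
      this.tags = tags;
      this.properties = properties;
      this.details = details;
    }
  }

  // Builds the result for a program. A Workflow result additionally lists
  // the programs that run under the workflow.
  static SearchResult forProgram(String name, String programType, String appName,
                                 List<String> tags, Map<String, String> properties,
                                 List<String> workflowPrograms) {
    Map<String, String> details = new LinkedHashMap<>();
    details.put("programType", programType);
    details.put("application", appName);
    if ("Workflow".equals(programType)) {
      details.put("workflowPrograms", String.join(",", workflowPrograms));
    }
    return new SearchResult(EntityType.PROGRAM, name, tags, properties, details);
  }

  // Builds the result for a view, which also reports its parent stream.
  static SearchResult forView(String name, String streamName,
                              List<String> tags, Map<String, String> properties) {
    Map<String, String> details = new LinkedHashMap<>();
    details.put("stream", streamName);
    return new SearchResult(EntityType.VIEW, name, tags, properties, details);
  }

  public static void main(String[] args) {
    // Example values are made up for illustration.
    SearchResult hit = forProgram(
        "PurchaseWorkflow", "Workflow", "PurchaseApp",
        Arrays.asList("daily"),
        Collections.singletonMap("owner", "ops"),
        Arrays.asList("PurchaseHistoryBuilder", "PurchaseReport"));
    System.out.println(hit.entityType + " " + hit.name + " " + hit.details);
  }
}
```

Keeping the type-specific fields in a separate details map is only one option. It keeps a single result class for all entity types, at the cost of weaker typing than one class per entity type.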
...
Design Decision:
- The search result for an entity will also include all of the metadata for that entity.
Open Question:
- Please suggest other details that we could add to the search results for the different entity types.
...
Emit more metadata from system entities:
Here is a list of the system metadata that we are planning to emit from different entities. If you have suggestions for other information that would be useful as system metadata, please comment below.
Artifacts
- Version
Applications
- ArtifactId
- Plugins
  - Plugin Type
  - Plugin Name
- Schedule
- Programs
Programs
- Type: Flow, MapReduce etc
- Workflow
  - Nodes under this workflow
- Mode: Batch, Realtime
- Type: Flow, MapReduce etc
Datasets
- Schema
- RecordScannable/BatchWritable/RecordWritable/BatchReadable
- Type: KVTable, FileSet etc
- ttl
Streams
- Schema
- ttl
Views
- Schema
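To make the list above more concrete, here is a minimal Java sketch of how system metadata for a dataset and an artifact might be collected before being written to the system metadata table. The class and method names, the property keys ("type", "schema", "ttl", "version"), and the choice to emit capabilities such as RecordScannable as tags are all assumptions made for illustration. The design above does not specify them.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: class, method, and key names are illustrative only.
class SystemMetadataSketch {

  // System metadata for one entity: key/value properties plus tags.
  static final class SystemMetadata {
    final Map<String, String> properties = new LinkedHashMap<>();
    final Set<String> tags = new LinkedHashSet<>();
  }

  // Collects the dataset fields from the list above. Optional values
  // (schema, TTL) are skipped when null so no empty entries are emitted.
  static SystemMetadata forDataset(String type, String schema, Long ttl,
                                   Set<String> capabilities) {
    SystemMetadata md = new SystemMetadata();
    md.properties.put("type", type);
    if (schema != null) {
      md.properties.put("schema", schema);
    }
    if (ttl != null) {
      md.properties.put("ttl", Long.toString(ttl));
    }
    // Assumption: capabilities such as RecordScannable are emitted as tags.
    md.tags.addAll(capabilities);
    return md;
  }

  // Artifacts only emit their version in the list above.
  static SystemMetadata forArtifact(String version) {
    SystemMetadata md = new SystemMetadata();
    md.properties.put("version", version);
    return md;
  }

  public static void main(String[] args) {
    Set<String> capabilities = new LinkedHashSet<>();
    capabilities.add("RecordScannable");
    capabilities.add("BatchWritable");
    // A KVTable without an explicit schema and with a made-up TTL value.
    SystemMetadata md = forDataset("KVTable", null, 3600L, capabilities);
    System.out.println(md.properties + " " + md.tags);
  }
}
```

Streams and views could follow the same pattern with their own subsets of fields (schema and ttl for streams, schema for views).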
Open Questions:
- Please suggest other details that we could add to the system metadata for the different entities.
- Nitin Motgi: Can we call "business metadata" "user metadata", and name the table that stores it userMetadata rather than business, to keep it consistent with other things such as metrics?