Overview
This page covers the requirements, design, and implementation of the metadata and data discovery features in CDAP 3.3.
High Level Requirements
- Metadata search
- Schema as metadata
- System metadata
- CLI and Test Framework support for metadata
- UI for Metadata Search
- UI for Lineage
- UI for Adding/Updating metadata properties/tags
- Lineage based on Type of Dataset Access
- Monitoring/Logs for Metadata Service
Scope
- Schema as metadata
- System metadata
- Metadata CLI
- Test Framework support for Metadata
- UI... (needs to be finalized)
User Stories
Id | Description | Requirements Fulfilled | Comments |
---|---|---|---|
U1 | As a user, I should be able to search Datasets containing the specified fields | List the kinds of queries that will be supported | |
U2 | As a CDAP system, I should be able to annotate CDAP entities with system metadata automatically | List all the system tags that should be annotated | |
U3 | As a user, I should be able to access and update CDAP metadata using the CDAP CLI | | |
U4 | As a developer, I should be able to access and update CDAP metadata using the CDAP Test Framework | | |
U5 | As a user, I should be able to search CDAP entities based on metadata using the CDAP UI | | |
U6 | As a user, I should be able to view the lineage of a CDAP dataset/stream in a specified time window using the CDAP UI | | |
System Metadata
Kinds of system metadata:
Applications
- Artifact name
Programs
- Type of program
Datasets
- Type of dataset
- Creation time - property
- Last update time? - property
- RecordScannable/BatchWritable/RecordWritable/BatchReadable
- Other properties
Streams
- Format
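The dataset entries above could be modeled roughly as follows. This is a minimal sketch for discussion, not the CDAP implementation: the `SystemMetadata` class, the `dataset_system_metadata` function, and the `creation-time` key are hypothetical names. It also assumes that items marked "property" in the list are stored as key/value properties and that the unmarked items (dataset type, capability interfaces) are stored as tags.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class SystemMetadata:
    """Hypothetical container for system-generated tags and properties."""
    tags: Set[str] = field(default_factory=set)
    properties: Dict[str, str] = field(default_factory=dict)


def dataset_system_metadata(dataset_type: str,
                            creation_time_ms: int,
                            capabilities: Set[str]) -> SystemMetadata:
    """Build system metadata for a dataset from the kinds listed above.

    Assumption: type and capabilities become tags; creation time
    becomes a property (it is marked "property" in the list).
    """
    md = SystemMetadata()
    md.tags.add(dataset_type)
    # e.g. RecordScannable, BatchWritable, RecordWritable, BatchReadable
    md.tags.update(capabilities)
    # "creation-time" is an illustrative key, not a confirmed name.
    md.properties["creation-time"] = str(creation_time_ms)
    return md


example = dataset_system_metadata(
    "KeyValueTable", 1446000000000, {"BatchReadable", "RecordScannable"})
```

Keeping tags and properties in separate collections would let a search match on a bare tag such as `RecordScannable` as well as on a key/value pair, which is one way to serve user story U2.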
Schema as Metadata
Questions