...
- User should be able to search key-value metadata by any of the following, either in full or by prefix:
- key-value
- key with part of value
- value
- Individual words in the value
Example:
A user stores key-value metadata with key = "Codename" and value = "Alpha Tango Charlie" for an entity. The user should be able to find this entity with the following queries:
- key-value
- Codename: Alpha Tango Charlie
- Codename: Alpha Tang*
- key with part of value
- Codename: Alpha
- Codename: Tango
- Codename: Charlie
- Codename: Alp*
- value
- Alpha Tango Charlie
- Alpha*
- Alpha Tan*
Note: We have decided not to support queries that contain only part of a multi-word value, for example "Tango Charlie". A user can search by the whole value, by a prefix of the whole value, or by a single word of the value (we plan to tokenize on whitespace).
- individual words in the value
- Alpha
- Tango
- Charlie
- Alph*
- Tan*
- Ch*
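The matching rules in this example can be sketched as follows. This is a minimal illustration, not the planned implementation: the function names `kv_index_terms`, `parse_query` and `matches` are hypothetical, and case-insensitive matching is an assumption. Each value is tokenized on whitespace and indexed both unscoped and scoped by its key, so a multi-word part of the value such as "Tango Charlie" is never indexed and therefore never matches.

```python
def kv_index_terms(key, value):
    """Return the (scope, text) terms under which one key-value pair is indexed."""
    key, value = key.strip().lower(), value.strip().lower()
    texts = {value, *value.split()}  # whole value plus each whitespace token
    # Each text is indexed twice: unscoped (None) and scoped by its key.
    return {(scope, text) for text in texts for scope in (None, key)}


def parse_query(query):
    """Split 'key: text' into (key, text); a query without ':' has scope None."""
    q = query.strip().lower()
    if ":" in q:
        key, text = q.split(":", 1)
        return key.strip(), text.strip()
    return None, q


def matches(query, terms):
    """Exact match, or prefix match when the query text ends with '*'."""
    scope, text = parse_query(query)
    if text.endswith("*"):
        prefix = text[:-1]
        return any(s == scope and t.startswith(prefix) for s, t in terms)
    return (scope, text) in terms


terms = kv_index_terms("Codename", "Alpha Tango Charlie")
supported = ["Codename: Alpha Tang*", "Codename: Tango", "Alpha Tan*", "Ch*"]
unsupported = ["Tango Charlie", "Codename: Tango Charlie"]
```

With these terms, every query in `supported` matches and every query in `unsupported` does not.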
- User should be able to search tags metadata by any of the following, either in full or by prefix:
- tags key and a tag value
- a tag value
Example:
A user tags an entity with the tags "Tag1, Tag22". The user should be able to find this entity with the following queries:
- tags key and a tag value:
- tags: Tag1
- tags: Tag*
- a tag value
- tag22
- tag2*
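A minimal sketch of the tag queries above, using a small in-memory inverted index. The class name `TagIndex` and the entity names are hypothetical, and lowercasing tags is an assumption suggested by the example, where the query `tag22` finds the tag "Tag22".

```python
from collections import defaultdict

TAGS_KEY = "tags"


class TagIndex:
    """Maps each lowercase tag to the entities that carry it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, entity, tags):
        for tag in tags:
            self._postings[tag.strip().lower()].add(entity)

    def search(self, query):
        text = query.strip().lower()
        if ":" in text:
            key, text = text.split(":", 1)
            if key.strip() != TAGS_KEY:
                return set()  # this index only holds tags
            text = text.strip()
        if text.endswith("*"):
            prefix = text[:-1]
            return {entity
                    for tag, entities in self._postings.items()
                    if tag.startswith(prefix)
                    for entity in entities}
        return set(self._postings.get(text, ()))


index = TagIndex()
index.add("dataset1", ["Tag1", "Tag22"])
index.add("stream1", ["Other"])
```

Because the `tags` key is checked separately instead of being stored as part of the indexed term, an unscoped prefix query such as `tag*` only matches tag values and does not accidentally match every tagged entity through the key name.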
======================================================================================================================
Use Cases:
- Key-Value Metadata (example: Codename: Alpha Tango Charlie)
- User should be able to search with
- Whole Key-Value (complete or prefix)
- Codename: Alpha Tango Charlie
- Codename: Alpha Tang*
- Key with Part of Value (complete or prefix)
- Codename: Alpha
- Codename: Tango
- Codename: Charlie
- Codename: Alp*
- Whole Value (complete or prefix):
- Alpha Tango Charlie
- Alpha*
- Alpha Tan*
Design Decision: We have decided not to support queries that contain only part of a multi-word value, for example "Tango Charlie". A user can search by the whole value, by a prefix of the whole value, or by a single word of the value (we plan to tokenize on whitespace).
- Individual words of the value (complete or prefix):
- Alpha
- Tango
- Charlie
- Alph*
- Tan*
- Ch*
- For tags metadata, user should be able to search
- With tags key and a tag value (complete or prefix):
- tags: Tag1
- tags: Tag*
- With tag value (complete or prefix):
- tag22
- tag2*
- For schema metadata, user should be able to search with
- fieldname
- fieldname scoped with schema; this should limit the search to schema fields only and exclude other metadata
- all entities which have a schema
Example:
A dataset has the following schema (Nested Schema):
{
  "EmpName": "String",
  "EmpContact": {
    "EmpTel": "Integer",
    "EmpAddr": "String"
  }
}
Use case: User should be able to search for this dataset with the following queries:
- fieldname (complete or prefix):
- EmpName
- EmpContact
- EmpTel
- EmpAddr
- Emp*
- fieldname scoped to schema:
- schema: EmpName
- schema: EmpContact
- schema: EmpTel
- schema: EmpAddr
- schema: Emp*
- search for all entities with a schema
- schema:*
- This will return this dataset entity and also all the other entities which have schema stored as their metadata
- We don't plan to support schema searches by field type.
- If a query is not scoped with schema, it will by default search schema fields in addition to the normal key-value and tags metadata.
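The schema queries above can be sketched as follows, assuming the schema is available as a nested dictionary like the example. The helper names `schema_field_names` and `schema_matches` are hypothetical, and case-insensitive matching is an assumption. Field names are collected at every nesting level, field types are not indexed, and `schema:*` matches any entity that has at least one schema field.

```python
def schema_field_names(schema):
    """Yield every field name in a nested schema dict, at any depth."""
    for name, field_type in schema.items():
        yield name
        if isinstance(field_type, dict):  # nested record: descend into it
            yield from schema_field_names(field_type)


def schema_matches(query, schema):
    """True if the query hits a field name of the given schema."""
    # Only names are indexed; field types are deliberately left out.
    fields = {name.lower() for name in schema_field_names(schema)}
    text = query.strip().lower()
    if ":" in text:
        scope, text = text.split(":", 1)
        if scope.strip() != "schema":
            return False  # scoped to some other kind of metadata
        text = text.strip()
    if text.endswith("*"):
        return any(f.startswith(text[:-1]) for f in fields)
    return text in fields


dataset_schema = {
    "EmpName": "String",
    "EmpContact": {"EmpTel": "Integer", "EmpAddr": "String"},
}
```

For `dataset_schema`, the queries `EmpTel`, `schema: EmpAddr`, `schema: Emp*` and `schema:*` all match, while `String` does not because field types are not searchable.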
Open questions:
- What if an entity has multiple schemas (for example, a transform, which has an input and an output schema)?
- Maybe we can index its fields separately under the input and output schema, stored with an identifier, and expect the user to specify whether they are looking for something in the input schema or the output schema.
- How will a user search for a fieldName across both input and output schemas?
- One way is, besides indexing the fields as input and output schema, to also index every field as just schema so that such queries can be performed.
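The double-indexing idea from the last open question could look roughly like this. It is only a sketch of one option under discussion: the function name `multi_schema_terms` and the scope names `input-schema` and `output-schema` are hypothetical, since the identifier format has not been decided.

```python
def multi_schema_terms(schemas):
    """Index each field under its own schema role and under plain 'schema'.

    `schemas` maps a role such as "input" or "output" to a list of field names.
    """
    terms = set()
    for role, fields in schemas.items():
        for field in fields:
            name = field.lower()
            terms.add((f"{role}-schema", name))  # hypothetical role-specific scope
            terms.add(("schema", name))          # role-agnostic scope
    return terms


transform_terms = multi_schema_terms({
    "input": ["EmpName", "EmpTel"],
    "output": ["EmpName", "EmpId"],
})
```

A query scoped to one role only sees that role's fields, while a query scoped to plain `schema` finds a field regardless of whether it appears in the input or the output schema. The cost is that every field is stored twice.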
Tags Metadata: tags: Tag1, Tag22
Use case: search for entities (datasets, streams, views) by field names in their schema, either in full or by prefix.
Design Decisions
New Design:
Storage:
Metadata Storage Format:
...