Facets

Note: Make this page's parent CDAP once facets work for both streams and datasets.

CDAP exposes an API that lets developers build their own plugins for parsing data in a Stream.
Developers should be able to build their own parser, using the CDAP-provided API, for parsing events in a Stream.
Developers and operations staff should then be able to deploy the implemented parser into a directory, together with a configuration.
Users should specify, at minimum, a name and description for the plugin in the configuration.
Users should be able to list the available plugins using the REST API or CLI.
Users should be able to view, using the REST API or CLI, the pre-defined schema of a plugin if the plugin defines one.
Users should be able to list the views associated with a Stream using the REST API, CLI, or UI.
Users should be able to apply a plugin to a Stream to create a view.
A user-specified view name should be registered in a catalog so that the view can be queried (SQL) by name.
Users should be able to apply different plugins to the same Stream, creating different views.
Users should be able to change the plugin associated with a view.
CDAP should provide a text wrangler plugin that lets users create rules for parsing mostly text files.
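
The view-related requirements above (several views on one Stream, listing views by Stream, changing the plugin behind a view) can be sketched as a small in-memory catalog. This is an illustration only: `View`, `ViewCatalog`, and the plugin names are hypothetical and are not part of the CDAP API.

```python
from dataclasses import dataclass


@dataclass
class View:
    """A named view: one parser plugin applied to one stream."""
    name: str
    stream: str
    plugin: str


class ViewCatalog:
    """Hypothetical in-memory catalog keyed by view name."""

    def __init__(self):
        self._views = {}

    def put(self, name, stream, plugin):
        # Creating and changing a view are the same operation:
        # the latest plugin registered under a view name wins.
        self._views[name] = View(name, stream, plugin)

    def views_for_stream(self, stream):
        # Several views may point at the same stream.
        return sorted(v.name for v in self._views.values() if v.stream == stream)

    def plugin_for(self, name):
        return self._views[name].plugin


catalog = ViewCatalog()
catalog.put("trades_csv", "stream1", "csv-parser")
catalog.put("trades_text", "stream1", "text-wrangler")
# Change the plugin associated with an existing view.
catalog.put("trades_csv", "stream1", "tsv-parser")
```

Keying the catalog by view name is what would let a SQL layer resolve a query against the view name alone, without knowing which Stream or plugin sits behind it.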

A facet is another place from which data can be read, like streams and datasets.
- Facets are therefore readable anywhere a stream or dataset is readable (MapReduce/Spark programs, flows, ETL).
A facet is a read-only view of a stream or dataset with a specific read format: a schema plus a format such as CSV or Avro.
If Explore is enabled, a Hive table is created for each facet.

Path: PUT /v3/namespaces/<namespace>/facets/<facet>
Request:
{
  "stream": "stream1",
  "format": <same as before>
}
Notes: Creates or modifies a facet.

Path: GET /v3/namespaces/<namespace>/facets/<facet>
Response:
{"id":"someFacet", "stream": "stream1", "format": ..}
Notes: Gets the details of an individual facet.

Path: GET /v3/namespaces/<namespace>/facets
Notes: Lists all facets.

Path: DELETE /v3/namespaces/<namespace>/facets/<facet>
Notes: Deletes a facet.

Path: GET /v3/namespaces/<namespace>/streams/<stream>/facets
Response:
[
  {"id":"someFacet", "stream": "stream1", "format": ..},
  {"id":"otherFacet", "stream": "stream2", "format": ..}
]
Notes: Lists all facets associated with a stream.
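
As a rough illustration of the create-or-modify call listed above, the following sketch builds (but does not send) the PUT request with Python's standard library. The base URL, the `default` namespace, and the contents of `example_format` are assumptions made for the example; the exact shape of the format specification is left open in this page.

```python
import json
import urllib.request

# Assumption: address of the CDAP router; adjust for your installation.
BASE_URL = "http://localhost:11015/v3"


def facet_url(namespace, facet):
    """Build the URL of a single facet from the path pattern above."""
    return f"{BASE_URL}/namespaces/{namespace}/facets/{facet}"


def put_facet_request(namespace, facet, stream, fmt):
    """Prepare a PUT request that creates or modifies a facet."""
    body = json.dumps({"stream": stream, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        facet_url(namespace, facet),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical format specification, used only to make the example concrete.
example_format = {"name": "csv"}
request = put_facet_request("default", "someFacet", "stream1", example_format)

# Against a running instance, the request would be sent with:
# urllib.request.urlopen(request)
```

Fetching the details of the same facet would be a plain GET on `facet_url("default", "someFacet")`.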

Notes

Requirements

body
a,b,c
d,e,f

ticker	num_traded	price
a	b	c
d	e	f

ticker	price
a	c
d	f
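
The sample data above appears to show the same two events three ways: the raw event body, a facet that names three columns, and a narrower facet that keeps only two of them. A minimal sketch of that mapping follows, assuming a comma-delimited body and the column names from the samples; `apply_facet` is a hypothetical helper, not a CDAP function.

```python
import csv
import io


def apply_facet(bodies, columns, select=None):
    """Parse each event body as one CSV row and name its fields.

    columns: field names assigned to the parsed values, in order.
    select:  optional subset of columns to keep (a narrower view).
    """
    keep = select if select is not None else columns
    rows = []
    for body in bodies:
        values = next(csv.reader(io.StringIO(body)))
        record = dict(zip(columns, values))
        rows.append({name: record[name] for name in keep})
    return rows


events = ["a,b,c", "d,e,f"]
schema = ["ticker", "num_traded", "price"]

# Three-column view of the events.
full_view = apply_facet(events, schema)
# Two-column view of the same events.
narrow_view = apply_facet(events, schema, select=["ticker", "price"])
```

Because both views are computed from the same unmodified events, the underlying stream stays unchanged, which matches the read-only nature of a facet.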