
      • CDAP exposes an API that developers can use to build their own plugins for parsing data in a Stream.
      • A developer should be able to build a custom parser, using the CDAP-provided API, for parsing events in the Stream.
      • Developers and operations staff should then be able to deploy the implemented parser into a directory, together with a configuration.
      • The user should specify, at a minimum, a name and a description for the plugin in the configuration.
      • The user should be able to list the available plugins using the REST API or the CLI.
      • The user should be able to view, using the REST API or the CLI, the pre-defined schema of a plugin if the plugin defines one.
      • The user should be able to list the views associated with a Stream using the REST API, the CLI, or the UI.
      • The user should be able to apply a plugin to a Stream and create a view.
      • A user-specified view name should be registered in a catalog so that the view can be queried (SQL) by name.
      • The user should be able to apply different plugins to the same Stream, creating different views.
      • The user should be able to change the plugin associated with a view.
      • CDAP should provide a text wrangler plugin that allows one to create rules for parsing mostly text files.
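
To make the plugin requirements above concrete, here is a minimal sketch of what a parser plugin contract could look like. It is not the CDAP API: the `StreamEventParser` interface, its method names, and the `DelimitedTextParser` example are all hypothetical and only illustrate a plugin that carries a name, a description, an optional pre-defined schema, and a parse method for a single stream event.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical plugin contract (an assumption for illustration, NOT the actual CDAP API).
interface StreamEventParser {
  String name();
  String description();

  // Field names the parser emits, or an empty list if the plugin pre-defines no schema.
  List<String> schemaFields();

  // Parses the body of one stream event into a field -> value record.
  Map<String, String> parse(String eventBody);
}

// Hypothetical example plugin: splits an event body on a delimiter
// and maps the values, in order, onto a fixed list of field names.
final class DelimitedTextParser implements StreamEventParser {
  private final String delimiter;
  private final List<String> fields;

  DelimitedTextParser(String delimiter, List<String> fields) {
    this.delimiter = delimiter;
    this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
  }

  @Override public String name() { return "delimited-text"; }

  @Override public String description() {
    return "Parses delimited text events into named fields";
  }

  @Override public List<String> schemaFields() { return fields; }

  @Override
  public Map<String, String> parse(String eventBody) {
    // Quote the delimiter so it is treated literally; limit -1 keeps trailing empty values.
    String[] values = eventBody.split(Pattern.quote(delimiter), -1);
    if (values.length != fields.size()) {
      throw new IllegalArgumentException(
          "Expected " + fields.size() + " fields but got " + values.length);
    }
    Map<String, String> record = new LinkedHashMap<>();
    for (int i = 0; i < values.length; i++) {
      record.put(fields.get(i), values[i]);
    }
    return record;
  }
}

final class ParserPluginDemo {
  public static void main(String[] args) {
    StreamEventParser parser =
        new DelimitedTextParser(",", Arrays.asList("ts", "level", "message"));
    System.out.println(parser.parse("1700000000,INFO,started"));
  }
}
```

In this sketch the name and description live on the plugin itself; the requirements place them in a deployment configuration, so a real design would more likely read them from that configuration.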

Overview

  • A facet is a view of a stream or dataset with a specific read format, that is, a schema plus a format (for example CSV or Avro)
  • There may be multiple facets per stream or dataset
  • If Explore is enabled, a Hive table is created for each facet
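
The relationships in this overview can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the `Facet` and `FacetRegistry` classes, the opaque string schema, and the `<source>_<facet>` table naming are hypothetical and do not describe how CDAP stores facets or names Hive tables.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a facet: a named read format (format + schema)
// over a stream or dataset.
final class Facet {
  final String name;
  final String format; // for example "csv" or "avro"
  final String schema; // kept as an opaque string in this sketch

  Facet(String name, String format, String schema) {
    this.name = name;
    this.format = format;
    this.schema = schema;
  }
}

// Hypothetical registry: each stream or dataset may have several facets.
final class FacetRegistry {
  private final Map<String, Map<String, Facet>> facetsBySource = new LinkedHashMap<>();
  private final boolean exploreEnabled;

  FacetRegistry(boolean exploreEnabled) {
    this.exploreEnabled = exploreEnabled;
  }

  // Adds a facet to a source, replacing any facet with the same name.
  void put(String source, Facet facet) {
    facetsBySource.computeIfAbsent(source, k -> new LinkedHashMap<>()).put(facet.name, facet);
  }

  // Lists the facets of a source in insertion order.
  List<Facet> list(String source) {
    Map<String, Facet> facets = facetsBySource.getOrDefault(source, Collections.emptyMap());
    return new ArrayList<>(facets.values());
  }

  // One table name per facet when explore is enabled, none otherwise.
  // The "<source>_<facet>" naming scheme is invented for this sketch.
  List<String> hiveTables(String source) {
    List<String> tables = new ArrayList<>();
    if (!exploreEnabled) {
      return tables;
    }
    for (Facet facet : list(source)) {
      tables.add(source + "_" + facet.name);
    }
    return tables;
  }
}
```

For example, registering a `csv` facet and an `avro` facet on the same source yields two entries from `list` and, with explore enabled, two table names from `hiveTables`.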