JIRA: CDAP-2777 (Cask Community Issue Tracker)

 

Goal: Introduce the ability for CDAP Programs to create streams and datasets and to register the use of Plugins.

Status Quo: Currently, streams and datasets can be created only at the CDAP Application level, and Plugins can be registered only in Adapters (through AdapterConfigurer). So if we want to remove the concept of Application Templates and Adapters, we have a couple of options:

i) Introduce the ability to register plugins in the Application's configure method.

  • ETL Plugins would need the ability to create streams/datasets.
  • Pros: Does not depend on programs having the ability to add streams/datasets.
  • Cons: Not the intuitive place for it. For example, the ETLRealtime and ETLBatch applications use these features in ETLWorker/ETLMapReduce, so it would be more intuitive to do this in programs instead of Applications.

ii) Introduce the ability to create streams/datasets/register plugins in CDAP Programs:

  • Through the Program Configurers, users can create streams/datasets, register plugins, etc. in CDAP Programs (think of them as local variables: create them where you need them).
  • Pros: Simplifies some application code, since streams/datasets are created and used only where they are needed. Simplifies the ETLRealtime and ETLBatch applications.
  • Cons:
    • Streams/Datasets created this way are NOT local variables, since they are accessible to all applications/programs in that namespace.
    • The same stream/dataset could be created in different places, possibly with different properties, and we need logic to handle that -> should we disallow it, or allow it as long as the properties are the same?
    • Some options are ambiguous for programs like Flows, Services, and Workflows -> should we allow addition of streams/datasets in them, or only in Flowlets/ServiceHandlers? (For example, stream connections are made only in Flows and not in Flowlets, even though Flows are just a collection of Flowlets.)
    • WorkflowAction cannot support these changes since it uses the builder pattern (there is a JIRA filed for this).
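
One of the open questions under option ii) is what to do when several programs declare the same dataset. The following is a minimal sketch of the "allow it as long as it has the same properties" policy. It is not CDAP code: the class DatasetDeclarations and its declare method are hypothetical names used only for illustration, and properties are assumed to be plain string maps.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of the CDAP API) that tracks dataset
 * declarations made by different programs in the same namespace.
 */
public class DatasetDeclarations {

  // Dataset name -> properties recorded from the first declaration seen.
  private final Map<String, Map<String, String>> declared = new HashMap<>();

  /**
   * Records a declaration of the named dataset.
   *
   * @return true if this is the first declaration of the name,
   *         false if it repeats an earlier declaration with equal properties
   * @throws IllegalArgumentException if the name was already declared
   *         with different properties
   */
  public boolean declare(String name, Map<String, String> properties) {
    Map<String, String> existing = declared.get(name);
    if (existing == null) {
      // Copy defensively so later changes by the caller do not affect us.
      declared.put(name, new HashMap<>(properties));
      return true;
    }
    if (existing.equals(properties)) {
      // Same name, same properties: treat as a harmless re-declaration.
      return false;
    }
    throw new IllegalArgumentException(
        "Dataset '" + name + "' is already declared with different properties");
  }
}
```

Under this sketch, two programs that both call declare("events", props) with equal property maps succeed, while a third program declaring "events" with different properties fails at configure time. The stricter alternative (disallow any re-declaration) would throw in the second branch as well.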

 Assuming we go with option ii), we propose the following API changes:

We will look at how the configure method of two program types will change:

 

  1. Worker: