...

  • Through the Program Configurers, users can create streams and datasets, register plugins, and so on from within CDAP Programs (think of them as local variables: create one where you need it).
  • Pros: Simplifies some application code, since a stream or dataset is created only where it is used. Simplifies the ETLRealtime and ETLBatch applications.
  • Cons: 
    • Streams/Datasets created this way are NOT really local variables, since they are accessible to all applications/programs in that namespace.
    • The same stream/dataset may be created in different places, possibly with different properties, so logic is needed to handle this -> should we disallow it, or allow it as long as the properties are the same?
    • Some ambiguous options in programs like Flows, Services, Workflows -> should we allow addition of streams/datasets in them or only in Flowlets/ServiceHandlers? (For example, stream connections are made only in Flows and not in Flowlets, even though Flows are just a collection of Flowlets)
    • WorkflowAction cannot support these changes since it uses the builder pattern (there is a JIRA filed for this).
    • In some programs, certain features might not be useful. For example, creating a stream in a Service is useless since there is no way to use it programmatically in a Service/ServiceHandler.
    • Datasets created in the configure method must also be added to useDatasets for Programs that don't use DynamicDatasetContext.
    • Streams/Datasets are local to a namespace but Plugins will be local to an application
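
One way to read the "allow it as long as it has the same properties" option raised above is sketched below. This is only an illustration of that policy, not CDAP code: the class NamespaceDatasetRegistry, its createDataset signature, and the use of a plain string map for properties are all hypothetical stand-ins for whatever the platform would actually use.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-namespace registry; NOT a CDAP class.
class NamespaceDatasetRegistry {
  // Dataset name -> properties it was first created with.
  private final Map<String, Map<String, String>> datasets = new HashMap<>();

  /**
   * Registers a dataset in the namespace.
   *
   * @return true if the dataset was newly created, false if an identical
   *         definition already existed (the duplicate call is a no-op)
   * @throws IllegalStateException if the name exists with different properties
   */
  public boolean createDataset(String name, Map<String, String> properties) {
    Map<String, String> existing = datasets.get(name);
    if (existing == null) {
      // First creation: keep a defensive copy of the properties.
      datasets.put(name, new HashMap<>(properties));
      return true;
    }
    if (existing.equals(properties)) {
      // Same name, same properties: allowed, nothing to do.
      return false;
    }
    // Same name, different properties: reject the conflicting definition.
    throw new IllegalStateException(
        "Dataset '" + name + "' already exists with different properties");
  }

  public static void main(String[] args) {
    NamespaceDatasetRegistry registry = new NamespaceDatasetRegistry();
    Map<String, String> props = Collections.singletonMap("ttl", "3600");

    // Two programs creating the same dataset with the same properties.
    System.out.println(registry.createDataset("etlrealtimestate", props));
    System.out.println(registry.createDataset("etlrealtimestate", props));

    // A third program using different properties is rejected.
    try {
      registry.createDataset("etlrealtimestate",
          Collections.singletonMap("ttl", "60"));
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Under this policy, repeated creation from several programs is harmless as long as the definitions agree, and a conflicting definition fails at configure time rather than at runtime. Disallowing duplicates entirely would amount to throwing in the second branch as well.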

 Assuming we go with option ii), we propose the following API changes:

...

  1. Worker:

    public class SimpleWorker extends AbstractWorker {

      @Override
      public void configure() {
        createDataset("etlrealtimestate", KeyValueTable.class);
        addStream(new Stream("hello"));
        addDatasetModule("abcModule", ABCModule.class);
        usePlugin("realtimesource", "kafka", "source", PluginProperties.EMPTY);
        usePlugin("realtimesink", "stream", "sink", PluginProperties.EMPTY);
      }
    }

  2. MapReduce:

    Very similar to the Worker example above.

  3. Flows and Flowlets:

    public static class SimpleFlow extends AbstractFlow { 
      public void configureFlow() {
        addFlowlet("abc", new SimpleFlowlet());
        addStream(new Stream("hello"));
        connectStream("hello", "abc");
      }
    }

    public static class SimpleFlowlet extends AbstractFlowlet {
      public 
    }