...
- Through the Program Configurers, users can create streams and datasets, register plugins, etc. in CDAP programs (think of them as local variables: create one where you need it)
- Pros: Simplifies some application code, since a stream or dataset is created and used only where it is needed. Simplifies the ETLRealtime and ETLBatch applications.
- Cons:
- Streams/Datasets created this way are NOT local variables, since they are accessible to all applications/programs in that namespace
- Logic is needed to handle creation of the same stream/dataset (possibly with different properties) in different places -> should we disallow it, or allow it as long as the properties are the same?
- Some ambiguous options in programs like Flows, Services, Workflows -> should we allow addition of streams/datasets in them or only in Flowlets/ServiceHandlers? (For example, stream connections are made only in Flows and not in Flowlets, even though Flows are just a collection of Flowlets)
- WorkflowAction cannot support these changes since it uses the builder pattern (there is a JIRA filed for this).
- In some programs, certain features might not be useful. For example, creating a stream in a Service is useless since there is no way to use it programmatically in a Service/ServiceHandler
- Datasets created in the configure method have to be added to useDatasets for Programs that don't use DynamicDatasetContext.
- Streams/Datasets are local to a namespace but Plugins will be local to an application
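One way to read the "allow it as long as it has the same properties" option above is sketched below. This is a minimal, self-contained model and not CDAP code: the class `DatasetRegistry`, its nested `Spec` type, and the method signature are all hypothetical names introduced for illustration. It assumes that a dataset is identified by name alone within a namespace, and that two creation requests are compatible only when both the type name and the properties match exactly.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical model of namespace-scoped dataset creation; not a CDAP class. */
class DatasetRegistry {

  /** A creation request: the dataset type name plus its properties. */
  static final class Spec {
    final String typeName;
    final Map<String, String> properties;

    Spec(String typeName, Map<String, String> properties) {
      this.typeName = typeName;
      this.properties = new HashMap<>(properties); // defensive copy
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Spec)) {
        return false;
      }
      Spec other = (Spec) o;
      return typeName.equals(other.typeName) && properties.equals(other.properties);
    }

    @Override
    public int hashCode() {
      return Objects.hash(typeName, properties);
    }
  }

  // Keyed by dataset name only: every program in the namespace shares this map,
  // which is why a dataset created in one program is not a "local variable".
  private final Map<String, Spec> datasets = new HashMap<>();

  /**
   * Registers a dataset.
   *
   * @return true if the dataset was newly created, false if an identical
   *         definition already existed (the repeated creation is a no-op)
   * @throws IllegalArgumentException if the name is taken by a different definition
   */
  boolean createDataset(String name, String typeName, Map<String, String> properties) {
    Spec requested = new Spec(typeName, properties);
    Spec existing = datasets.putIfAbsent(name, requested);
    if (existing == null) {
      return true;
    }
    if (existing.equals(requested)) {
      return false;
    }
    throw new IllegalArgumentException(
        "Dataset '" + name + "' already exists with a different type or properties");
  }

  public static void main(String[] args) {
    DatasetRegistry namespace = new DatasetRegistry();
    Map<String, String> noProps = Collections.emptyMap();
    // First program creates the dataset; a second program repeats the same request.
    System.out.println(namespace.createDataset("etlrealtimestate", "KeyValueTable", noProps));
    System.out.println(namespace.createDataset("etlrealtimestate", "KeyValueTable", noProps));
  }
}
```

Under this reading, the stricter "disallow it" alternative would amount to throwing whenever the name is already present, regardless of whether the properties match.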
Assuming we go with option ii), we propose the following API changes:
...
- Worker:
public class SimpleWorker extends AbstractWorker {
  @Override
  public void configure() {
    createDataset("etlrealtimestate", KeyValueTable.class);
    addStream(new Stream("hello"));
    addDatasetModule("abcModule", ABCModule.class);
    usePlugin("realtimesource", "kafka", "source", PluginProperties.EMPTY);
    usePlugin("realtimesink", "stream", "sink", PluginProperties.EMPTY);
  }
}
- MapReduce:
Very similar to what Worker looks like.
- Flows and Flowlets:
public static class SimpleFlow extends AbstractFlow {
  public void configureFlow() {
    addFlowlet("abc", new SimpleFlowlet());
    addStream(new Stream("hello"));
    connectStream("hello", "abc");
  }
}
public static class SimpleFlowlet extends AbstractFlowlet {
public
}