...
- When CDAP standalone is started, it will start PreviewService (which has PreviewHttpHandler) along with other required services. When CDAP shuts down, PreviewService will be terminated.
- A no-op implementation of PreviewContext will be injected into the SDK, and BasicPreviewContext will be injected into preview.
- DatasetFramework and DiscoveryService from the SDK will be used by Preview.
- DiscoveryService will be used to register the preview service so that it can be discovered by the router.
- DatasetFramework will be used to access the datasets in the user namespace.
- The user will submit a preview request through the preview REST endpoint.
- A rule in cdap-router will forward preview requests to PreviewHttpHandler.
- PreviewHttpHandler will receive the request with the preview configuration and generate a unique preview id for it, which will be used as the app id.
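The split between a no-op PreviewContext in the SDK and BasicPreviewContext in preview can be pictured with a small sketch. Everything named here (`PreviewEmitterSketch`, `NoOpPreviewEmitter`, `RecordingPreviewEmitter`, `PreviewEmitters.forMode`, the `emit` signature) is hypothetical and is not the CDAP API. A plain factory method stands in for the injection described above, and a map stands in for the preview store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for PreviewContext; the real interface may look different.
interface PreviewEmitterSketch {
  void emit(String propertyName, Object value);
}

// Used outside preview: emitted data is dropped, so program code needs no mode checks.
final class NoOpPreviewEmitter implements PreviewEmitterSketch {
  @Override
  public void emit(String propertyName, Object value) {
    // intentionally empty
  }
}

// Used in preview: emitted data is appended to a sink (a stand-in for the preview store).
final class RecordingPreviewEmitter implements PreviewEmitterSketch {
  private final Map<String, List<Object>> sink;

  RecordingPreviewEmitter(Map<String, List<Object>> sink) {
    this.sink = sink;
  }

  @Override
  public void emit(String propertyName, Object value) {
    List<Object> values = sink.get(propertyName);
    if (values == null) {
      values = new ArrayList<>();
      sink.put(propertyName, values);
    }
    values.add(value);
  }
}

// Assumption: a factory method replaces dependency injection to keep the sketch self-contained.
final class PreviewEmitters {
  private PreviewEmitters() { }

  static PreviewEmitterSketch forMode(boolean previewMode, Map<String, List<Object>> sink) {
    return previewMode ? new RecordingPreviewEmitter(sink) : new NoOpPreviewEmitter();
  }
}
```

With this shape, program code calls `emit` unconditionally, and only the binding chosen at startup decides whether anything is recorded.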
When the app gets configured during LocalArtifactLoaderStage, the application will replace the config object with one updated for preview:

    public class DataPipelineApp extends AbstractApplication<ETLBatchConfig> {
      public void configure() {
        ETLBatchConfig config = getConfig();
        if (config.isPreviewMode()) {
          // This method should be responsible for creating the new pipeline configuration,
          // for example: replacing the source with a mock source.
          config = config.getConfigForPreview();
        }
        PipelineSpecGenerator<ETLBatchConfig, BatchPipelineSpec> specGenerator = new BatchPipelineSpecGenerator(...);
        BatchPipelineSpec spec = specGenerator.generateSpec(config);
        PipelinePlanner planner = new PipelinePlanner(...);
        PipelinePlan plan = planner.plan(spec);
        addWorkflow(new SmartWorkflow(spec, plan, getConfigurer(), config.getEngine()));
        scheduleWorkflow(...);
      }
    }
OPEN QUESTION: There will be an inconsistency between the application's JSON configuration and the programs created for the application, because the programs are created after the app configuration has been updated with the preview configs.
Preview application deployment pipeline
| Stage Name | Regular Application | Preview Application |
|---|---|---|
| LocalArtifactLoaderStage | Yes | Yes |
| ApplicationVerificationStage | Yes | Yes |
| DeployDatasetModulesStage | Yes | No |
| CreateDatasetInstanceStage | Yes | No |
| CreateStreamsStage | Yes | No |
| DeleteProgramHandlerStage | Yes | No |
| ProgramGenerationStage | Yes | Yes |
| ApplicationRegistrationStage | Yes | Yes |
| CreateSchedulesStage | Yes | No |
| SystemMetadataWriterStage | Yes | No |

If there is a failure in the deploy pipeline, PreviewHttpHandler will return 500 status code with deploy failure reason.
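One way to encode the stage selection listed above is a single table-driven enum, so that the regular and preview pipelines are built from the same source of truth. This is only a sketch: `DeployStageSketch` and `stagesFor` are hypothetical names, and the stages are represented by their names rather than by the real stage classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical encoding of the stage table; every stage runs for a regular application.
enum DeployStageSketch {
  LOCAL_ARTIFACT_LOADER("LocalArtifactLoaderStage", true),
  APPLICATION_VERIFICATION("ApplicationVerificationStage", true),
  DEPLOY_DATASET_MODULES("DeployDatasetModulesStage", false),
  CREATE_DATASET_INSTANCE("CreateDatasetInstanceStage", false),
  CREATE_STREAMS("CreateStreamsStage", false),
  DELETE_PROGRAM_HANDLER("DeleteProgramHandlerStage", false),
  PROGRAM_GENERATION("ProgramGenerationStage", true),
  APPLICATION_REGISTRATION("ApplicationRegistrationStage", true),
  CREATE_SCHEDULES("CreateSchedulesStage", false),
  SYSTEM_METADATA_WRITER("SystemMetadataWriterStage", false);

  private final String stageName;
  private final boolean runsInPreview;

  DeployStageSketch(String stageName, boolean runsInPreview) {
    this.stageName = stageName;
    this.runsInPreview = runsInPreview;
  }

  // Returns the stage names, in pipeline order, that apply to the given mode.
  static List<String> stagesFor(boolean previewMode) {
    List<String> names = new ArrayList<>();
    for (DeployStageSketch stage : values()) {
      if (!previewMode || stage.runsInPreview) {
        names.add(stage.stageName);
      }
    }
    return names;
  }
}
```

Calling `DeployStageSketch.stagesFor(true)` yields the four stages marked "Yes" for preview, while `stagesFor(false)` yields all ten.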
- Once deployment is successful, the preview handler will start the program and return the preview id as the response. Currently we will start the SmartWorkflow in DataPipelineApp; however, the preview configuration can be extended to accept the program type and program name to start.
- During runtime, when a program emits preview data using PreviewContext, its implementation (BasicPreviewContext) will write that data to PreviewStore.
- PreviewStore can store data in memory. It cannot be a Table dataset because we want the intermediate data even if the transaction fails. It also cannot be a FileSet dataset, because a MapReduce program that fails cleans up its files. (Potentially we can use a non-transactional table, like Metrics.)
- Logs TBD
- Metrics for preview will be stored in the Metric dataset created for preview.
- Deletion of the preview data: We can maintain an LRU cache of the preview data keyed by preview id. In 3.5 we can restrict the LRU cache size to 1.
- Get preview data: PreviewHttpHandler will use PreviewManager to query for preview data from the preview store, logs, and metrics.
- PreviewStore: PreviewStore will be responsible for storing the preview data. The implementation of PreviewStore will store the data in memory for 3.5. In the future we can consider storing it in a LevelDB dataset.
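The in-memory store and the LRU deletion policy described above can be combined in a few lines. The sketch below is an assumption about shape only: `LruPreviewStoreSketch`, its methods, and the `stageName` key are hypothetical and are not the CDAP PreviewStore API. It relies on `java.util.LinkedHashMap` in access order with `removeEldestEntry` to evict the least recently used preview id.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory preview store with LRU eviction per preview id.
final class LruPreviewStoreSketch {
  // previewId -> (stageName -> values emitted by that stage)
  private final Map<String, Map<String, List<Object>>> dataByPreviewId;

  LruPreviewStoreSketch(final int maxPreviews) {
    // accessOrder = true keeps entries ordered from least to most recently used.
    this.dataByPreviewId =
        new LinkedHashMap<String, Map<String, List<Object>>>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<String, Map<String, List<Object>>> eldest) {
            // Called after each insertion; drops the least recently used preview.
            return size() > maxPreviews;
          }
        };
  }

  synchronized void put(String previewId, String stageName, Object value) {
    Map<String, List<Object>> perPreview = dataByPreviewId.get(previewId);
    if (perPreview == null) {
      perPreview = new HashMap<>();
      dataByPreviewId.put(previewId, perPreview);
    }
    List<Object> values = perPreview.get(stageName);
    if (values == null) {
      values = new ArrayList<>();
      perPreview.put(stageName, values);
    }
    values.add(value);
  }

  // Returns a copy so callers cannot mutate the stored data.
  synchronized List<Object> get(String previewId, String stageName) {
    Map<String, List<Object>> perPreview = dataByPreviewId.get(previewId);
    if (perPreview == null || !perPreview.containsKey(stageName)) {
      return Collections.emptyList();
    }
    return new ArrayList<>(perPreview.get(stageName));
  }

  // containsKey does not count as an access, so it does not affect eviction order.
  synchronized boolean contains(String previewId) {
    return dataByPreviewId.containsKey(previewId);
  }
}
```

Constructing the store with `maxPreviews` set to 1 matches the 3.5 restriction: starting a second preview evicts the data of the first. Because writes do not go through a transaction, data emitted before a failure remains readable.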
Implementation Plan:
1) Preview Service
...