...

  1. When CDAP Standalone is started, it will start the PreviewService (which hosts the PreviewHttpHandler) along with the other required services. When CDAP shuts down, the PreviewService will be terminated.
  2. A no-op implementation of PreviewContext will be injected in the SDK, while BasicPreviewContext will be injected in preview mode.
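The two bindings above can be sketched as follows. This is a minimal sketch: the interface and the emit() signature are assumptions, since the real PreviewContext API is not shown in this document.

```java
// Sketch of the two PreviewContext bindings: a no-op implementation for the
// regular SDK, and a BasicPreviewContext for preview mode.
public class PreviewContextBindings {

  interface PreviewContext {
    void emit(String tracerName, String value);
  }

  /** Injected in the SDK: emitting preview data does nothing. */
  static class NoopPreviewContext implements PreviewContext {
    @Override
    public void emit(String tracerName, String value) {
      // intentionally empty
    }
  }

  /** Injected in preview mode: records emitted values (stand-in for writing to the PreviewStore). */
  static class BasicPreviewContext implements PreviewContext {
    final java.util.List<String> recorded = new java.util.ArrayList<>();

    @Override
    public void emit(String tracerName, String value) {
      recorded.add(tracerName + "=" + value);
    }
  }
}
```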
  3. Preview will use the DatasetFramework and DiscoveryService from the SDK. 
    1. DiscoveryService will be used to register the preview service so that it can be discovered by the router.
    2. DatasetFramework will be used to access the datasets in the user namespace. 
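The registration in 3a can be sketched as below. The interface here is a simplified stand-in for the real discovery API (which lives in org.apache.twill.discovery), and the service name "preview.service" is an assumption.

```java
import java.net.InetSocketAddress;

// Simplified stand-in for discovery registration: the preview service exposes
// a Discoverable that the router can later look up by name.
public class PreviewDiscovery {

  interface Discoverable {
    String getName();
    InetSocketAddress getSocketAddress();
  }

  static Discoverable previewDiscoverable(final InetSocketAddress address) {
    return new Discoverable() {
      @Override
      public String getName() {
        // Assumed service name under which the preview service registers.
        return "preview.service";
      }

      @Override
      public InetSocketAddress getSocketAddress() {
        return address;
      }
    };
  }
}
```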
  4. The user will submit a preview request through the preview REST endpoint.
  5. There will be a rule in cdap-router that forwards preview requests to the PreviewHttpHandler.
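The routing rule could look like the sketch below. The actual cdap-router matches on configured path patterns; the "/previews" path segment used here is an assumed convention for the preview endpoints.

```java
// Decides whether a request path should be forwarded to the PreviewHttpHandler
// rather than to the regular app fabric handlers.
public final class PreviewRouteRule {
  private PreviewRouteRule() {
  }

  static boolean routesToPreviewService(String path) {
    // Assumed shape of the preview endpoint: /v3/namespaces/{namespace}/previews/...
    return path.startsWith("/v3/namespaces/") && path.contains("/previews");
  }
}
```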
  6. PreviewHttpHandler will receive the request with the preview configurations and generate a unique preview id for it, which will be used as the app id.
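Generating the unique preview id can be as simple as the following sketch; the "preview-" prefix and the use of a UUID are assumptions, not the handler's actual scheme.

```java
import java.util.UUID;

// Sketch: derive a unique preview id to use as the app id for a preview run.
public final class PreviewIds {
  private PreviewIds() {
  }

  static String newPreviewId() {
    return "preview-" + UUID.randomUUID();
  }
}
```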
  7. When the app gets configured during LocalArtifactLoaderStage, the application will replace the config object with one updated for preview:

    Code Block
    public class DataPipelineApp extends AbstractApplication<ETLBatchConfig> {
      @Override
      public void configure() {
        ETLBatchConfig config = getConfig();
        if (config.isPreviewMode()) {
          // Responsible for creating the pipeline configuration for preview,
          // for example replacing the source with a mock source.
          config = config.getConfigForPreview();
        }

        PipelineSpecGenerator<ETLBatchConfig, BatchPipelineSpec> specGenerator = new BatchPipelineSpecGenerator(...);
        BatchPipelineSpec spec = specGenerator.generateSpec(config);
        PipelinePlanner planner = new PipelinePlanner(...);
        PipelinePlan plan = planner.plan(spec);

        addWorkflow(new SmartWorkflow(spec, plan, getConfigurer(), config.getEngine()));
        scheduleWorkflow(...);
      }
    }
  8. There will be an inconsistency between the application JSON configuration and the programs created for the application, since programs are created only after the app configuration has been updated with the preview configs. - OPEN QUESTION

  9. Preview application deployment pipeline

    Stage Name                      Regular Application    Preview Application
    LocalArtifactLoaderStage        Yes                    Yes
    ApplicationVerificationStage    Yes                    Yes
    DeployDatasetModulesStage       Yes                    No
    CreateDatasetInstanceStage      Yes                    No
    CreateStreamsStage              Yes                    No
    DeleteProgramHandlerStage       Yes                    No
    ProgramGenerationStage          Yes                    Yes
    ApplicationRegistrationStage    Yes                    Yes
    CreateSchedulesStage            Yes                    No
    SystemMetadataWriterStage       Yes                    No
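The preview pipeline above can be summarized in code: only the stages marked "Yes" for a preview application are included. The stage names match the table; how the stages are actually wired together in the deploy pipeline is not shown here.

```java
import java.util.Arrays;
import java.util.List;

// Stages of the preview application deployment pipeline, per the table above.
public final class PreviewDeployStages {
  private PreviewDeployStages() {
  }

  static List<String> previewStages() {
    return Arrays.asList(
        "LocalArtifactLoaderStage",
        "ApplicationVerificationStage",
        "ProgramGenerationStage",
        "ApplicationRegistrationStage");
  }
}
```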
  10. If there is a failure in the deploy pipeline, PreviewHttpHandler will return a 500 status code with the reason for the deploy failure.

  11. Once deployment succeeds, the preview handler will start the program and return the preview id in the response. Currently we will start the SmartWorkflow in DataPipelineApp; however, the preview configurations can be extended to accept the program type and program name to start.
  12. At runtime, when the program emits preview data through PreviewContext, its implementation (BasicPreviewContext) will write that data to the PreviewStore.
  13. PreviewStore can store data in memory. It cannot be a Table dataset, because we want the intermediate data even if the transaction fails. It also cannot be a FileSet dataset, because a failed MapReduce program cleans up its files. (Potentially we could use a non-transactional table, like the one Metrics uses.)
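A minimal in-memory PreviewStore could look like the sketch below. Because writes bypass transactions entirely, emitted records survive even when the program's transaction fails. The class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory PreviewStore sketch: records are keyed by preview id and kept in
// plain heap collections, outside any transaction.
public final class InMemoryPreviewStore {
  private final Map<String, List<String>> records = new ConcurrentHashMap<>();

  void put(String previewId, String record) {
    List<String> list = records.get(previewId);
    if (list == null) {
      list = Collections.synchronizedList(new ArrayList<String>());
      List<String> existing = records.putIfAbsent(previewId, list);
      if (existing != null) {
        list = existing;
      }
    }
    list.add(record);
  }

  List<String> get(String previewId) {
    List<String> list = records.get(previewId);
    return list == null ? Collections.<String>emptyList() : list;
  }
}
```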
  14. Logs TBD
  15. Metrics for preview will be stored in the Metric dataset created for preview.