...

  1. Program runs executed during preview, datasets created for preview purposes, and logs and metrics emitted during preview should be isolated from the regular Standalone execution, which is used to publish and run the pipeline.
  2. In Preview, a pipeline could have a transform with lookup datasets that reads from datasets in Standalone, so we need a way for Preview to share datasets with Standalone.
  3. In Preview, we want to skip writing metadata and lineage information, as they are unnecessary.

Preview Injector and Services Run in Preview vs Standalone Injector:

Service                                             Standalone (Yes/No)   Preview (Yes/No)
userInterfaceService                                Yes                   No
trackerAppCreationService                           Yes                   No
router                                              Yes                   No
streamService                                       Yes                   Yes
exploreExecutorService                              Yes                   No
exploreClient                                       Yes                   No
metadataService                                     Yes                   No
serviceStore (set/get service instances)            Yes                   No
appFabricServer                                     Yes                   No
previewServer                                       No                    Yes
datasetService                                      Yes                   Yes
metricsQueryService                                 Yes                   Yes
txService                                           Yes                   Yes
externalAuthenticationServer (if security enabled)  Yes                   Yes
logAppenderInitializer                              Yes                   Yes
kafkaClient (if audit enabled)                      Yes                   No
zkClient (if audit enabled)                         Yes                   No
authorizerInstantiator (started by default)         Yes                   Yes?
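
The table above can also be written down as data, which makes it easy to list the services each injector would start. The following is only an illustrative sketch: the class name ServiceMatrix and its methods are hypothetical and not part of the code base, the service names are copied from the table as plain strings, and authorizerInstantiator is left out because its Preview column is still marked as uncertain ("Yes?").

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: encodes the Standalone/Preview columns of the table as data.
public class ServiceMatrix {
  // service name -> {runs in Standalone, runs in Preview}
  static final Map<String, boolean[]> SERVICES = new LinkedHashMap<>();

  static void add(String name, boolean standalone, boolean preview) {
    SERVICES.put(name, new boolean[] {standalone, preview});
  }

  static {
    add("userInterfaceService", true, false);
    add("trackerAppCreationService", true, false);
    add("router", true, false);
    add("streamService", true, true);
    add("exploreExecutorService", true, false);
    add("exploreClient", true, false);
    add("metadataService", true, false);
    add("serviceStore", true, false);
    add("appFabricServer", true, false);
    add("previewServer", false, true);
    add("datasetService", true, true);
    add("metricsQueryService", true, true);
    add("txService", true, true);
    add("externalAuthenticationServer", true, true); // only if security is enabled
    add("logAppenderInitializer", true, true);
    add("kafkaClient", true, false);                 // only if audit is enabled
    add("zkClient", true, false);                    // only if audit is enabled
    // authorizerInstantiator is omitted: the table marks its Preview column as "Yes?".
  }

  // Services marked "Yes" in the Preview column.
  static List<String> previewServices() {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, boolean[]> e : SERVICES.entrySet()) {
      if (e.getValue()[1]) {
        result.add(e.getKey());
      }
    }
    return result;
  }

  // Services marked "Yes" for Standalone and "No" for Preview.
  static List<String> standaloneOnlyServices() {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, boolean[]> e : SERVICES.entrySet()) {
      if (e.getValue()[0] && !e.getValue()[1]) {
        result.add(e.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println("Preview: " + previewServices());
    System.out.println("Standalone only: " + standaloneOnlyServices());
  }
}
```

Keeping the matrix in one place like this would let the two injectors be checked against the table, but whether the real injectors are driven by such a structure is not stated here.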

 

AppFabricServer vs PreviewServer Services:

The services started by PreviewServer are a subset of the services started in the app-fabric server.

...

HybridDatasetFramework:
... snippet
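// Routes dataset lookups by namespace: datasets in the system namespace are served
// by the preview framework, all other datasets by the standalone framework.
@Nullable
@Override
public <T extends Dataset> T getDataset(Id.DatasetInstance datasetInstanceId,
                                        @Nullable Map<String, String> arguments,
                                        @Nullable ClassLoader classLoader)
  throws DatasetManagementException, IOException {
  if (datasetInstanceId.getNamespace().equals(Id.Namespace.SYSTEM)) {
    // System-namespace datasets are isolated in the preview space.
    return previewDatasetFramework.getDataset(datasetInstanceId, arguments, classLoader);
  } else {
    // Non-system datasets (for example lookup datasets) are read from Standalone.
    return standaloneDatasetFramework.getDataset(datasetInstanceId, arguments, classLoader);
  }
}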
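
The routing rule in the snippet above can be tried out in isolation with simplified stand-ins. This is a minimal sketch, not the actual implementation: SimpleFramework and HybridFramework are hypothetical classes that replace the real DatasetFramework types with a string-keyed map, and the system namespace is assumed to be identified by the string "system".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the namespace-based routing shown above.
public class HybridRoutingSketch {
  // Assumption: stands in for Id.Namespace.SYSTEM.
  static final String SYSTEM_NAMESPACE = "system";

  // Stand-in for a dataset framework: stores datasets under "namespace:name".
  static class SimpleFramework {
    private final String label;
    private final Map<String, String> datasets = new HashMap<>();

    SimpleFramework(String label) {
      this.label = label;
    }

    void add(String namespace, String name) {
      String key = namespace + ":" + name;
      datasets.put(key, label + "/" + key);
    }

    // Returns null when the dataset does not exist, mirroring the @Nullable contract.
    String get(String namespace, String name) {
      return datasets.get(namespace + ":" + name);
    }
  }

  // Delegates to the preview framework for the system namespace, otherwise to standalone.
  static class HybridFramework {
    private final SimpleFramework preview;
    private final SimpleFramework standalone;

    HybridFramework(SimpleFramework preview, SimpleFramework standalone) {
      this.preview = preview;
      this.standalone = standalone;
    }

    String getDataset(String namespace, String name) {
      if (SYSTEM_NAMESPACE.equals(namespace)) {
        return preview.get(namespace, name);
      }
      return standalone.get(namespace, name);
    }
  }

  public static void main(String[] args) {
    SimpleFramework preview = new SimpleFramework("preview");
    SimpleFramework standalone = new SimpleFramework("standalone");
    preview.add("system", "app.meta");
    standalone.add("default", "lookupTable");

    HybridFramework hybrid = new HybridFramework(preview, standalone);
    System.out.println(hybrid.getDataset("system", "app.meta"));
    System.out.println(hybrid.getDataset("default", "lookupTable"));
  }
}
```

With this rule, a lookup dataset that exists only in Standalone stays visible to a preview run, while anything written to the system namespace during preview stays in the preview space. The dataset names app.meta and lookupTable are made up for the example.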

 

  

Adapting to Cluster