...

We want the program runs executed during preview, the datasets created for preview purposes, and the logs and metrics emitted during preview to be isolated from the regular Standalone execution, which is used to publish and run the pipeline.
In Preview, a pipeline may contain a transform that performs lookups against datasets in Standalone, so we need a way for Preview to share those Standalone datasets.
In Preview, we want to skip writing metadata and lineage information, as they are unnecessary.
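
One way to picture the third requirement is to bind a recorder that does nothing when running in Preview. The sketch below is illustrative only: `LineageRecorder`, `StoringLineageRecorder`, `NoOpLineageRecorder`, and `RecorderSelection` are hypothetical names invented for this example and are not part of the actual code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface for this sketch; not an actual platform API.
interface LineageRecorder {
  void record(String programRun, String dataset);
}

// Stand-in for the Standalone behaviour: every access is kept.
class StoringLineageRecorder implements LineageRecorder {
  private final List<String> entries = new ArrayList<>();

  @Override
  public void record(String programRun, String dataset) {
    entries.add(programRun + " -> " + dataset);
  }

  List<String> entries() {
    return entries;
  }
}

// Stand-in for the Preview behaviour: calls are accepted and dropped.
class NoOpLineageRecorder implements LineageRecorder {
  @Override
  public void record(String programRun, String dataset) {
    // Intentionally empty: lineage is not needed for a preview run.
  }
}

// Assumed selection point; in practice this choice would live in the injector bindings.
class RecorderSelection {
  static LineageRecorder forMode(boolean preview) {
    return preview ? new NoOpLineageRecorder() : new StoringLineageRecorder();
  }
}
```

With this shape, calling code records lineage unconditionally and the mode decides whether anything is persisted.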

Preview Injector and Services Run in Preview vs Standalone Injector:

Service	Standalone (Yes/No)	Preview (Yes/No)
userInterfaceService	Yes	No
trackerAppCreationService	Yes	No
router	Yes	No
streamService	Yes	Yes
exploreExecutorService	Yes	No
exploreClient	Yes	No
metadataService	Yes	No
serviceStore (set/get service instances)	Yes	No
appFabricServer	Yes	No
previewServer	No	Yes
datasetService	Yes	Yes
metricsQueryService	Yes	Yes
txService	Yes	Yes
externalAuthenticationServer (if security enabled)	Yes	Yes
logAppenderInitializer	Yes	Yes
kafkaClient (if audit enabled)	Yes	No
zkClient (if audit enabled)	Yes	No
authorizerInstantiator (started by default)	Yes	Yes?

AppFabricServer vs PreviewServer Services:

PreviewServer starts a subset of the services started in the app-fabric server.

...

HybridDatasetFramework:

... snippet
@Nullable
@Override
public <T extends Dataset> T getDataset(Id.DatasetInstance datasetInstanceId,
                                        @Nullable Map<String, String> arguments,
                                        @Nullable ClassLoader classLoader)
  throws DatasetManagementException, IOException {
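  if (datasetInstanceId.getNamespace().equals(Id.Namespace.SYSTEM)) {
    // System-namespace datasets are served by the preview dataset framework.
    return previewDatasetFramework.getDataset(datasetInstanceId, arguments, classLoader);
  } else {
    // All other datasets are delegated to the Standalone dataset framework,
    // which lets Preview read datasets that exist in Standalone (e.g. lookup datasets).
    return standaloneDatasetFramework.getDataset(datasetInstanceId, arguments, classLoader);
  }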
}
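
The routing rule in the snippet above can be tried in isolation with a self-contained sketch. Everything here is a simplified assumption for illustration: `SimpleDatasetFramework`, `InMemoryFramework`, and `HybridFrameworkSketch` are hypothetical stand-ins, namespaces are plain strings, and the value `"system"` for the system namespace is assumed rather than taken from the real `Id.Namespace.SYSTEM`.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a dataset framework; not the real interface.
interface SimpleDatasetFramework {
  String getDataset(String namespace, String name);
}

// In-memory framework that tags each dataset with the framework's own label,
// so a caller can see which framework served a request.
class InMemoryFramework implements SimpleDatasetFramework {
  private final String label;
  private final Map<String, String> datasets = new HashMap<>();

  InMemoryFramework(String label) {
    this.label = label;
  }

  void addDataset(String namespace, String name) {
    datasets.put(namespace + ":" + name, label + "/" + namespace + "/" + name);
  }

  @Override
  public String getDataset(String namespace, String name) {
    // Returns null when the dataset does not exist in this framework.
    return datasets.get(namespace + ":" + name);
  }
}

// Mirrors the branch in the snippet: the system namespace goes to the preview
// framework, every other namespace is delegated to the standalone framework.
class HybridFrameworkSketch implements SimpleDatasetFramework {
  static final String SYSTEM_NAMESPACE = "system"; // assumed value for this sketch

  private final SimpleDatasetFramework preview;
  private final SimpleDatasetFramework standalone;

  HybridFrameworkSketch(SimpleDatasetFramework preview, SimpleDatasetFramework standalone) {
    this.preview = preview;
    this.standalone = standalone;
  }

  @Override
  public String getDataset(String namespace, String name) {
    if (SYSTEM_NAMESPACE.equals(namespace)) {
      return preview.getDataset(namespace, name);
    }
    return standalone.getDataset(namespace, name);
  }
}
```

For example, after registering a dataset in the `system` namespace with the preview framework and a dataset in some user namespace with the standalone framework, lookups through `HybridFrameworkSketch` return values labelled with the framework that served them, which makes the routing easy to check.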

Adapting to Cluster