Sharing data between real and preview space

Goal: Data generated by a preview should stay isolated so that it does not interfere with data generated during normal operation. A preview still needs to read the actual user datasets, however. This document compares the options for accessing and sharing data between preview spaces and real spaces. Datasets from the real space should be available to a preview for read-only use; datasets in the preview space should support both reads and writes.

Option 1: Pass the list of real-space datasets to be used read-only to the preview as a configuration option.

In this option, a HybridDatasetFramework maintains a set of real dataset names passed in through the application config. A call to getDataset() is forwarded to the real dataset framework if the dataset name is in the set; otherwise the dataset from the preview framework is returned.

Because the HybridDatasetFramework holds the set of dataset names, a new injector has to be created every time a preview is requested, so that the dataset framework instance is created with the appropriate dataset names.

public class HybridDatasetFramework implements DatasetFramework {
  private final DatasetFramework previewDatasetFramework;
  private final DatasetFramework realDatasetFramework;
  private final Set<String> datasetNames;
  // ...implementation
  public <T extends Dataset> T getDataset(String name) {
    if (datasetNames.contains(name)) {
      // get real dataset - read only
      return realDatasetFramework.getDataset(name);
    }
    // get preview dataset - read/write
    return previewDatasetFramework.getDataset(name);
  }
}
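
The routing rule above can be tried out in isolation with a minimal, self-contained sketch. Everything in it is an assumption made for illustration: SimpleDatasetFramework is a simplified stand-in for the real DatasetFramework interface, TaggedFramework is a hypothetical in-memory framework that only reports which space served the request, and datasets are represented as plain strings.

```java
import java.util.Set;

// Simplified stand-in for the real DatasetFramework interface (assumption).
interface SimpleDatasetFramework {
  String getDataset(String name);
}

// Hypothetical framework that tags each dataset name with the space serving it.
class TaggedFramework implements SimpleDatasetFramework {
  private final String space;

  TaggedFramework(String space) {
    this.space = space;
  }

  @Override
  public String getDataset(String name) {
    return space + ":" + name;
  }
}

// Forwards lookups for the configured names to the real space,
// and every other lookup to the preview space.
class HybridRouting implements SimpleDatasetFramework {
  private final SimpleDatasetFramework preview;
  private final SimpleDatasetFramework real;
  private final Set<String> realDatasetNames;

  HybridRouting(SimpleDatasetFramework preview, SimpleDatasetFramework real,
                Set<String> realDatasetNames) {
    this.preview = preview;
    this.real = real;
    // Defensive copy: the set is fixed for the lifetime of this instance.
    this.realDatasetNames = Set.copyOf(realDatasetNames);
  }

  @Override
  public String getDataset(String name) {
    return realDatasetNames.contains(name)
        ? real.getDataset(name)
        : preview.getDataset(name);
  }
}
```

Because the set of names is fixed at construction time, a different set of names requires a new instance, which is why this option needs a new injector per preview request.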

Pros:

  1. Does not require an API change and is easy to understand.
  2. Because a new injector is created for every preview run, each run is further isolated from the others. This makes deleting a preview easier, since it only involves removing the directory corresponding to that preview run.

Cons:

  1. More memory is used when multiple injectors are created.
  2. A cache of injectors per preview run can be maintained, but once an injector expires from the cache it has to be created again before the data of that run can be retrieved.
  3. The user needs to explicitly supply the set of dataset names in the preview request.
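
The caching trade-off in the second point can be sketched as a small bounded cache that rebuilds an entry after it has been evicted. This is only an illustration under assumptions: PreviewInjectorCache, the size-based eviction policy, and the factory function are hypothetical, and the type parameter T stands in for whatever injector type the preview uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical bounded cache of per-preview injectors, keyed by preview run id.
class PreviewInjectorCache<T> {
  private final Function<String, T> factory;
  private final Map<String, T> cache;
  private int creations = 0;

  PreviewInjectorCache(int maxEntries, Function<String, T> factory) {
    this.factory = factory;
    // Access-ordered LinkedHashMap that drops the least recently used entry
    // once the cache grows beyond maxEntries.
    this.cache = new LinkedHashMap<String, T>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, T> eldest) {
        return size() > maxEntries;
      }
    };
  }

  // Returns the cached injector, or creates it again if it was evicted.
  synchronized T get(String previewId) {
    T injector = cache.get(previewId);
    if (injector == null) {
      injector = factory.apply(previewId);
      creations++;
      cache.put(previewId, injector);
    }
    return injector;
  }

  // Number of times the factory had to build an injector.
  synchronized int creations() {
    return creations;
  }
}
```

With a cache of size one, alternating between two preview runs rebuilds an injector on every lookup, which is the cost this point describes.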

Option 2: API changes to the DatasetContext.

In this option, a new overload getDataset(..., Options option) is added that accepts options for the dataset. The application is responsible for calling the correct version of getDataset. If a program is going to be previewed and a dataset is only used for reading, the program should call getDataset(..., Options.READONLY). In all other cases it calls the existing getDataset(...) method.
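
A minimal sketch of how such an overload could behave while a preview is running is shown here. The shape of Options is not specified in this document, so modelling it as an enum with a single READONLY constant is an assumption, as are the PreviewDatasetContext class, the string-valued datasets, and the two lookup functions standing in for the real and preview spaces.

```java
import java.util.function.Function;

// Assumed shape of the options argument; only READONLY is named in this document.
enum Options {
  READONLY
}

// Hypothetical dataset context as seen by a program running in preview mode.
class PreviewDatasetContext {
  private final Function<String, String> realSpace;
  private final Function<String, String> previewSpace;

  PreviewDatasetContext(Function<String, String> realSpace,
                        Function<String, String> previewSpace) {
    this.realSpace = realSpace;
    this.previewSpace = previewSpace;
  }

  // Existing method: datasets are resolved in the preview space (read/write).
  String getDataset(String name) {
    return previewSpace.apply(name);
  }

  // New overload: READONLY datasets are resolved in the real space.
  String getDataset(String name, Options option) {
    return option == Options.READONLY
        ? realSpace.apply(name)
        : previewSpace.apply(name);
  }
}
```

Here the choice of space is made at each call site rather than through a configured list of names, which is what moves the responsibility to the application.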

Pros:

  1. No need to pass the list of dataset names in the config.
  2. The new API getDataset(..., Options option) can be used for other purposes in the future, e.g. to request a dataset for a particular access type: READ, WRITE, READ_WRITE, UNKNOWN.

Cons:

  1. The application is responsible for making the correct calls.
