...

Root Type | Description | API Changes
DatasetConfigurer

For adding Dataset Module, Dataset Type and Dataset Instance creation at application deployment time

  • Removal of DatasetConfigurer, DatasetModule, DatasetDefinition and related classes
    • Dataset modules and types are no longer supported
  • The properties provided via the createDataset method will be stored inside the application specification
    • No more centralized management of dataset instances
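
For reference, a minimal sketch of how an application creates a dataset at deployment time today, via the createDataset method that DatasetConfigurer contributes to AbstractApplication; under this proposal, the properties passed here would be persisted in the application specification instead of being managed centrally (the "ttl" property value is illustrative):

import co.cask.cdap.api.app.AbstractApplication;
import co.cask.cdap.api.dataset.DatasetProperties;
import co.cask.cdap.api.dataset.table.Table;

// Minimal sketch: dataset creation at application deployment time through
// the DatasetConfigurer methods inherited by AbstractApplication.
public class ExampleApp extends AbstractApplication {
  @Override
  public void configure() {
    setName("ExampleApp");
    // Under this proposal, these properties would be stored in the
    // application specification rather than in a central dataset service.
    createDataset("events", Table.class,
                  DatasetProperties.builder().add("ttl", "86400").build());
  }
}
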
DatasetContext

For instantiation of Dataset instances based on the module, type and properties

  • The original set of getDataset methods will be replaced with one that takes an explicit Type object
    • This also allows the Dataset class to be defined inside a plugin artifact (see CDAP-3992)
  • Each Dataset instance will have the DatasetContext instance injected in order to obtain embedded Datasets
  • Dataset properties will come from the application specification, preferences, runtime arguments, and explicitly provided properties via the API
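
A hypothetical shape for the revised interface; the single getDataset method and its parameter list are assumptions for illustration, not a finalized API:

import java.lang.reflect.Type;
import java.util.Map;

// Hypothetical sketch of the revised DatasetContext; names are illustrative.
public interface DatasetContext {
  // Replaces the original family of getDataset overloads. The caller supplies
  // an explicit Type, which allows the Dataset class to be defined inside a
  // plugin artifact and loaded through the plugin classloader (CDAP-3992).
  <T> T getDataset(String name, Type datasetType, Map<String, String> properties);
}
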
DatasetManager

Basically a programmatic gateway to DatasetService for administrative operations

  • Create, Upgrade, Truncate, Delete, Exists
  • Metadata retrieval (type, properties)
  • Removal of DatasetManager and related classes
  • The Dataset implementation itself is (optionally) responsible for admin operations
    • Performed directly from the user program
  • Underlying resources can also be managed separately, outside of CDAP
  • Core Datasets will expose methods / classes for admin operations
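
One way to read "the Dataset implementation is responsible for admin operations" is that each Dataset exposes an admin contract directly to the user program. The sketch below mirrors the operations listed above; the interface name, placement, and exact signatures are assumptions:

import java.io.IOException;

// Hypothetical admin contract a Dataset implementation could expose directly
// to user programs; names and signatures are illustrative.
public interface DatasetAdminOperations {
  boolean exists() throws IOException;
  void create() throws IOException;    // provision the underlying resources
  void upgrade() throws IOException;   // migrate after a format/type change
  void truncate() throws IOException;  // delete the data, keep the dataset
  void drop() throws IOException;      // delete the data and the resources
}
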
Core Dataset Interfaces

Collections of common data usage patterns exposed via interfaces

  • Provides abstraction over actual implementation (mainly standalone vs distributed)

Will largely stay the same, with the following modifications:

  • Expose methods / classes to perform administrative operations
  • Provide an SPI contract for cloud providers to supply cloud-native Dataset implementations (more below)
    • e.g. Table -> Bigtable, FileSet -> Blob Storage
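
A hypothetical shape for that SPI; everything here (interface name, methods, lookup model) is an assumption sketched for illustration:

import java.util.Map;

// Hypothetical SPI a cloud provider could implement to back a core Dataset
// interface with a cloud-native store (e.g. Table -> Bigtable,
// FileSet -> blob storage). All names are illustrative.
public interface CloudDatasetProvider {
  // Whether this provider can supply an implementation of the given core
  // Dataset interface (e.g. Table.class, FileSet.class).
  boolean supports(Class<?> coreInterface);

  // Instantiates the cloud-native implementation using the merged dataset
  // properties (application specification, preferences, runtime arguments).
  <T> T create(Class<T> coreInterface, Map<String, String> properties);
}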

...

Mission Control

The Mission Control runs as part of CDAP master. It consists of the Provisioner, the Program Launcher, and the Runtime Monitor. It is responsible for the whole lifecycle of a program execution. Upon receiving a start program request, it executes the program through the following steps:

...


For operational simplicity and scalability, the Provisioner, the Program Launcher, and the Runtime Monitor will run in the same JVM process, meaning they will be scaled as one unit. In current CDAP 4, minus the provisioner, this role is mainly fulfilled by the ProgramRuntimeService, which can only run inside the CDAP master. In order to pave the way for scaling out the Mission Control in the future, we should have a logical separation between the Mission Control and the CDAP master. The role of the ProgramRuntimeService will be simplified to:

  1. Loads the program runtime data from the metadata store
    • Artifact information, runtime arguments, preferences, runtime profile, etc.
  2. Generates a RunId and publishes it together with the program runtime data to TMS (provisioning start program request)
    • This is essentially taking a snapshot of all information needed for the program execution

This essentially makes the ProgramRuntimeService a stateless library that can be used anywhere. For example, it can be called from an HTTP handler or from the scheduler, and those could be running in different processes (which is needed, since we will need to scale out and provide HA for the scheduler in the future).
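
A minimal sketch of that stateless shape; the collaborator types (MetadataStore, MessagePublisher), the topic name, and the use of a UUID as the RunId are all illustrative placeholders, not CDAP API:

import java.util.Map;

// Illustrative placeholders for the metadata store and the TMS publisher.
interface MetadataStore {
  Map<String, String> loadRuntimeData(String programId, Map<String, String> args);
}
interface MessagePublisher {
  void publish(String topic, String runId, Map<String, String> payload);
}

// Hypothetical sketch of the simplified, stateless ProgramRuntimeService.
public final class ProgramRuntimeService {
  private final MetadataStore store;        // artifacts, preferences, profile
  private final MessagePublisher publisher; // TMS publisher

  public ProgramRuntimeService(MetadataStore store, MessagePublisher publisher) {
    this.store = store;
    this.publisher = publisher;
  }

  // Stateless: callable from an HTTP handler or the scheduler, in any process.
  public String start(String programId, Map<String, String> runtimeArgs) {
    // 1. Load the program runtime data from the metadata store.
    Map<String, String> data = store.loadRuntimeData(programId, runtimeArgs);
    // 2. Generate the RunId and publish the snapshot to TMS as the
    //    provisioning start program request.
    String runId = java.util.UUID.randomUUID().toString();
    publisher.publish("program.start", runId, data);
    return runId;
  }
}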

The Mission Control will poll TMS for program start messages; upon receiving a new message, it will:

  1. The Provisioner picks up the provisioning start program request and starts the provision logic
  2. When provisioning completes, it notifies the Program Launcher to start the program in the cluster
    • The notification can simply be done with an in-memory method call, via some interface contract, assuming the Provisioner and the Program Launcher are always running in the same process
    • The notification should include all the program runtime data plus the unique identifier for the execution cluster
  3. The Program Launcher starts the Program Runtime in the given cluster
    • Different (cloud) clusters might need different ways of starting the Program Runtime
  4. The Program Runtime sets up the CDAP runtime environment in order to provide CDAP services to the program (more on this in a later section)
  5. The Program Runtime launches the user program via ProgramRunner and manages the program lifecycle
  6. On the Mission Control side, after the Program Launcher has started the Program Runtime (step 3), it notifies the Runtime Monitor to start monitoring the running program
    • Similar to above, the notification can be done with a method call, provided the state has been persisted
      • If the process fails, upon restart the Runtime Monitor will just read from the state store and resume monitoring
      • The Run Record Corrector is part of the Runtime Monitor
    • The Runtime Monitor maintains a heartbeat with the Program Runtime to get the latest state updates (more on this in a later section)
  7. On completion of the program execution, the Program Runtime will persist the final states into cloud storage
    1. This could be optional, depending on how much information needs to be retained in case of failure (more on this in a later section)
    2. The Runtime Monitor will fetch the final states (or will already have received them via the last heartbeat) and update the program states
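
To make the in-memory hand-offs concrete, a hypothetical sketch of the polling loop; the collaborator interfaces match the roles described above, but every name and signature here is an illustrative assumption:

import java.util.List;

// Illustrative placeholder contracts for the Mission Control components.
interface MessageFetcher  { List<String> poll(String topic); }
interface Provisioner     { String provision(String startRequest); } // returns cluster id
interface ProgramLauncher { void launch(String clusterId, String startRequest); }
interface RuntimeMonitor  { void startMonitoring(String clusterId, String startRequest); }

// Hypothetical sketch of the Mission Control polling loop.
public final class MissionControl implements Runnable {
  private final MessageFetcher fetcher;
  private final Provisioner provisioner;
  private final ProgramLauncher launcher;
  private final RuntimeMonitor monitor;

  public MissionControl(MessageFetcher fetcher, Provisioner provisioner,
                        ProgramLauncher launcher, RuntimeMonitor monitor) {
    this.fetcher = fetcher;
    this.provisioner = provisioner;
    this.launcher = launcher;
    this.monitor = monitor;
  }

  @Override
  public void run() {
    while (!Thread.currentThread().isInterrupted()) {
      for (String request : fetcher.poll("program.start")) {
        // Steps 1-2: provision the cluster, then notify the launcher via a
        // plain in-memory call (all components share the same JVM process).
        String clusterId = provisioner.provision(request);
        launcher.launch(clusterId, request);
        // Step 6: once launched, hand off to the Runtime Monitor, which
        // maintains the heartbeat with the Program Runtime.
        monitor.startMonitoring(clusterId, request);
      }
    }
  }
}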

...

Program Launcher

The Program Launcher is relatively simple. It is responsible for launching the CDAP runtime in the cluster provisioned by the Provisioner, which involves:

  1. Gather CDAP runtime jars, program artifacts, program runtime data, configurations, etc.
  2. Launch the CDAP runtime on the target runtime cluster with the information gathered in step 1
    • This is a cluster-specific step, but generally involves file localization and launching the CDAP runtime JVM on the cluster. Here are some possibilities:
      • ssh + scp the files and run a command
      • Talk to the YARN RM directly via the RM Client protocol for localization as well as for launching the YARN application
      • Build and publish a docker image (localization) and interact with the Kubernetes master to launch the CDAP runtime in a container
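
As an illustration of the ssh + scp option, a minimal sketch; the host, the remote path, and the ProgramRuntimeMain class name are all assumptions:

import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch of the "ssh + scp" launch strategy.
public final class SshProgramLauncher {
  public void launch(String host, List<Path> localFiles)
      throws IOException, InterruptedException {
    // Step 1: localize runtime jars, program artifacts and configuration.
    for (Path file : localFiles) {
      exec("scp", file.toString(), host + ":/opt/cdap/runtime/");
    }
    // Step 2: start the CDAP runtime JVM on the target cluster
    // (main class name is illustrative).
    exec("ssh", host, "java -cp '/opt/cdap/runtime/*' io.cdap.ProgramRuntimeMain");
  }

  private void exec(String... command) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(command).inheritIO().start();
    if (process.waitFor() != 0) {
      throw new IOException("Command failed: " + String.join(" ", command));
    }
  }
}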

Runtime Monitor

Program Runtime

...