
...

  •  User stories documented (Andreas)
  •  User stories reviewed (Nitin)
  •  User stories reviewed (Todd)
  •  Requirements documented (Andreas)
  •  Requirements Reviewed
  •  Mockups Built
  •  Design Built
  •  Design Accepted

...

  • Deploying a dataset type (or module) is implemented as deployment of an artifact with version "embedded"
    • Open question: what does this mean for configuration and for the recording of dependencies?
  • Version "embedded" is treated like a snapshot version, that is, it can be redeployed any time. For now this is the only version we use. 
  • Creating a dataset instance tags that instance with version "embedded"
  • New implementation of the dataset framework (and actually, of the dataset instantiator, for explore only) that loads the code from the artifact repo.
  • In programs, since the only version is "embedded", dataset code is still loaded using the program class loader.
  • No introduction of new or versioned APIs
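
The snapshot-like treatment of version "embedded" can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store (ArtifactStoreSketch is not a CDAP class) and also assumes that any version other than "embedded" would be write-once, which the list above does not state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory artifact store, for illustration only; not a CDAP class.
class ArtifactStoreSketch {
  static final String EMBEDDED = "embedded";

  // Key is "name:version"; value stands in for the artifact contents.
  private final Map<String, String> artifacts = new HashMap<>();

  // Version "embedded" behaves like a snapshot: it may be redeployed at any time.
  // Assumption: any other version is write-once and a second deploy is rejected.
  void deploy(String name, String version, String contents) {
    String key = name + ":" + version;
    if (artifacts.containsKey(key) && !EMBEDDED.equals(version)) {
      throw new IllegalStateException("Artifact already exists: " + key);
    }
    artifacts.put(key, contents);
  }

  // Returns the stored contents, or null if nothing was deployed under that key.
  String get(String name, String version) {
    return artifacts.get(name + ":" + version);
  }
}
```

Deploying the same dataset type twice with version "embedded" replaces the earlier contents, which is the redeploy behavior the list describes.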

...

  1. Simplification of explore configuration
    1. Whether explore is enabled is an explicit property
    2. All other explore properties are derived from dataset properties where possible
  2. Explore failure also fails the DTM operation that called it
  3. Ability to communicate warnings to the user for successful explore operations
  4. Enable/Disable explore as dataset management operations 
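
Requirements 2 and 3 above could be combined roughly as follows. This is a sketch under stated assumptions: ExploreLifecycleSketch, Result, ExploreStep, and ExploreException are hypothetical names, not existing CDAP APIs, and the choice of an unchecked exception is only for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these types exist in CDAP under these names.
class ExploreLifecycleSketch {

  // Unchecked exception signaling that the explore step failed (assumption).
  static final class ExploreException extends RuntimeException {
    ExploreException(String message) {
      super(message);
    }
  }

  // Result of a successful operation, carrying any warnings for the user.
  static final class Result {
    final List<String> warnings;

    Result(List<String> warnings) {
      this.warnings = Collections.unmodifiableList(new ArrayList<>(warnings));
    }
  }

  // The explore step returns warnings on success and throws ExploreException on failure.
  interface ExploreStep {
    List<String> enable(String datasetName);
  }

  // Creates a dataset; explore is only touched when it is explicitly enabled.
  static Result createDataset(String datasetName, boolean exploreEnabled, ExploreStep explore) {
    if (!exploreEnabled) {
      return new Result(Collections.<String>emptyList());
    }
    // No try/catch here: an explore failure propagates and fails the whole operation.
    List<String> warnings = explore.enable(datasetName);
    return new Result(warnings);
  }
}
```

The caller either gets a Result (possibly with warnings to display) or an exception; there is no outcome in which explore failed but the operation reports success.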

Proposed Scope for 4.0

  1. Minimal work to remove artifact management from DatasetTypeManager
    1. Remove the (experimental) REST API to deploy a dataset module by itself
    2. For dataset types/modules deployed from an app, remove the generation of an artifact. Instead, record the app artifact that it was created from
    3. Similar to item 2, for dataset types included in plugins
    4. For apps, load dataset types from program class loader. For explore, load from the artifact recorded for the type
    5. May require some changes in artifact repository
  2. Simplify configuration of datasets
    1. Schema and format as system properties, with validation
    2. TTL as a system property
  3. New API for a dataset type to declare what configuration it accepts (needed for Resource Center)
    1. Properties (instance configuration)
    2. Arguments (runtime configuration)
  4. Make dataset lifecycle methods (create, update, drop) consistent
    1. In case of failure, do not leave partial/inconsistent state behind
    2. Do not silently ignore explore failures: they must fail the entire operation
  5. Simplify configuration of explore properties CDAP-2790 
    1. Derive all explore properties from schema and format when possible.
    2. Allow configuring the detailed explore properties (as today) for power users.
  6. Improved control over transactions for programs CDAP-7319
    1. Configure transaction timeout as a runtime argument / preference at namespace, app, program level CDAP-6103
    2. Programmatic APIs for programs that allow executing a transaction with custom timeout CDAP-7193, CDAP-7320, CDAP-7322
    3. Add a way to access datasets (and call non-transactional methods) CDAP-7323
    4. Fix the transactional behavior of WorkerContext.execute() CDAP-6837
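
Item 6.1 leaves open how a timeout set at several levels would be resolved. One possible reading is that the most specific level wins, sketched below. The key name "tx.timeout.seconds", the class TxTimeoutSketch, and the precedence order (program over app over namespace over a system default) are all assumptions for illustration, not part of the proposal.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of scoped timeout resolution; not a CDAP class.
class TxTimeoutSketch {
  // Hypothetical property name; the real key is not specified in this document.
  static final String KEY = "tx.timeout.seconds";

  // Scopes are passed most specific first, e.g. program, then app, then namespace.
  // The first scope that defines the key wins; otherwise the default applies.
  static int resolve(int defaultSeconds, List<Map<String, String>> scopesMostSpecificFirst) {
    for (Map<String, String> scope : scopesMostSpecificFirst) {
      String value = scope.get(KEY);
      if (value != null) {
        int seconds = Integer.parseInt(value.trim());
        if (seconds <= 0) {
          throw new IllegalArgumentException("Transaction timeout must be positive: " + value);
        }
        return seconds;
      }
    }
    return defaultSeconds;
  }
}
```

With this ordering, a program-level runtime argument overrides an app-level preference, which in turn overrides a namespace-level preference. Invalid values fail fast instead of silently falling back to the default.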