System Bootstrapping
Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
CDAP allows you to create and manage many entities at the system level: system profiles, system preferences, system artifacts, and system apps. Each is managed differently. Profiles and preferences must be created and managed manually through the REST endpoints once CDAP is up and running. Artifacts are loaded automatically from a directory every time CDAP starts. System apps are sometimes created manually and sometimes created automatically by the backend on startup. Any additional instance-specific bootstrapping must be handled by the CDAP administrator. For example, if a CDAP admin wants the default amount of memory used by program containers to be 4 GB, they must manually set a system preference after CDAP has been installed and is running. In environments where multiple CDAP instances must be installed and bootstrapped, it is up to the administrator to perform the manual steps required to set everything up consistently.
Goals
Provide a unified mechanism used to bootstrap a CDAP instance.
Use Cases
- An organization gives every developer their own CDAP sandbox for development. The company hosts much of their development infrastructure and data in the cloud. The administrator wants to ensure that every developer's CDAP instance is pre-configured with a cloud runtime profile as the system default instead of the native profile. They also want to make sure the dataprep app is deployed with a specific config such that the cloud connections are pre-defined.
- An organization runs many clusters for different use cases, each with its own CDAP instance. A new cluster is created every quarter. System administrators want to ensure that every new CDAP instance comes installed with a set of required system applications whose worker and service programs enforce certain compliance requirements on each cluster.
User Stories
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system profiles
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system preferences
- As a System Administrator, I want new CDAP instances to bootstrap themselves with pre-configured system applications
- As a System Administrator, I want new CDAP instances to automatically start system programs
- As a System Administrator, I want certain bootstrap actions to only be run once on a new CDAP instance
- As a System Administrator, I want certain bootstrap actions to be run every time CDAP is restarted
- As a System Administrator, if CDAP dies or is shut down before bootstrap can finish, I want bootstrap to be retried the next time CDAP starts
- As a System Administrator, I want to be able to manually re-run bootstrap operations at a later time
- As a System Administrator, I don't want a manual bootstrap operation to modify any existing CDAP entities
- As a System Administrator, I want to see log warnings when a bootstrap step fails
- As a System Administrator, I want to see an informational log message when a bootstrap step is skipped because an entity already exists
- As a System Administrator, I don't want a failed bootstrap step to block subsequent steps from running
- As a CDAP user, I want to be able to use CDAP normally even if the bootstrap fails
- As a CDAP user, I want certain system programs (like dataprep) to automatically start whenever the CDAP sandbox starts
Design
We will add a new 'system.bootstrap.file' configuration setting. This points to a file that will specify what entities need to be bootstrapped.
Some bootstrap steps will only automatically happen once per CDAP instance. We will need to keep track of whether or not the instance has been bootstrapped by setting some value in a system table. The CDAP master leader will perform the bootstrap operations.
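The once-per-instance tracking could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `BootstrapStateStore` is a hypothetical stand-in for the system table, and the sketch assumes the flag is written only after every once-only step has been attempted, so that an interrupted bootstrap is retried on the next start.

```java
import java.util.List;

/** Illustrative sketch only; these types are hypothetical, not CDAP classes. */
class OncePerInstanceSketch {

  /** Stand-in for the system table that remembers whether bootstrap has completed. */
  interface BootstrapStateStore {
    boolean isBootstrapped();
    void markBootstrapped();
  }

  /** Simple in-memory store used for the demonstration. */
  static final class InMemoryStateStore implements BootstrapStateStore {
    private boolean bootstrapped;
    @Override public boolean isBootstrapped() { return bootstrapped; }
    @Override public void markBootstrapped() { bootstrapped = true; }
  }

  /**
   * Runs the once-only steps unless the store says they already ran.
   * Returns true if the steps were executed during this call.
   */
  static boolean bootstrapIfNeeded(BootstrapStateStore store, List<Runnable> onceOnlySteps) {
    if (store.isBootstrapped()) {
      return false;
    }
    for (Runnable step : onceOnlySteps) {
      step.run();
    }
    // Assumption: the flag is written last, so a crash before this line
    // leaves the instance un-bootstrapped and the steps are retried.
    store.markBootstrapped();
    return true;
  }

  public static void main(String[] args) {
    BootstrapStateStore store = new InMemoryStateStore();
    List<Runnable> steps = List.of(() -> System.out.println("create profile ABC"));
    System.out.println(bootstrapIfNeeded(store, steps)); // first start: steps run
    System.out.println(bootstrapIfNeeded(store, steps)); // restart: steps skipped
  }
}
```

Writing the flag last trades a possible repeat of already-completed steps for the guarantee that no step is silently lost, which is safe as long as each step skips entities that already exist.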
Manual re-runs
Bootstrap can be re-run by calling a new REST endpoint 'POST /v3/bootstrap'. When bootstrap is re-run, entities will only be created if they do not already exist. For example, if the bootstrap file contains a step to create a profile named 'ABC' and there is an existing profile named 'ABC', the bootstrap process will ignore that step. Existing entities will not be modified either.
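The skip-if-exists behaviour can be illustrated with a small sketch. It assumes profiles live in a plain in-memory map keyed by name; the class, method, and property names are hypothetical and only demonstrate the intended semantics.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; the profile store here is a plain in-memory map. */
class CreateIfAbsentSketch {
  enum Outcome { CREATED, SKIPPED }

  /**
   * Creates the profile only when no profile with that name exists.
   * An existing profile is left untouched, even if its properties differ.
   */
  static Outcome createProfileIfAbsent(Map<String, Map<String, String>> profiles,
                                       String name, Map<String, String> properties) {
    // putIfAbsent returns the existing value, or null if the new one was stored
    Map<String, String> existing = profiles.putIfAbsent(name, properties);
    return existing == null ? Outcome.CREATED : Outcome.SKIPPED;
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> profiles = new LinkedHashMap<>();
    profiles.put("ABC", Map.of("region", "manually-set"));

    // 'ABC' already exists, so the bootstrap step is skipped
    System.out.println(createProfileIfAbsent(profiles, "ABC", Map.of("region", "from-bootstrap")));
    // 'XYZ' does not exist, so it is created
    System.out.println(createProfileIfAbsent(profiles, "XYZ", Map.of("region", "from-bootstrap")));
    // The manually created profile keeps its original properties
    System.out.println(profiles.get("ABC").get("region"));
  }
}
```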
Failure scenarios
If CDAP dies or is shut down in the middle of a bootstrap, the bootstrap will be retried the next time CDAP starts up. Conflicts will be handled the same way as in manual re-runs: if a bootstrap step targets an entity that already exists, the step will be skipped, and the existing entity will not be modified.
If a bootstrap step fails with a non-retryable exception, it will be skipped and the next step will be executed.
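A rough sketch of this skip-and-continue behaviour is shown below. The `StepAction` and `StepResult` types are hypothetical, every exception is treated as non-retryable for simplicity (retry handling for retryable exceptions is omitted), and logging uses `java.util.logging` purely for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

/** Illustrative sketch only; these types are hypothetical, not CDAP classes. */
class ContinueOnFailureSketch {
  private static final Logger LOG = Logger.getLogger(ContinueOnFailureSketch.class.getName());

  enum Status { SUCCEEDED, FAILED }

  /** Outcome of one step: its label, its status, and an error message if it failed. */
  static final class StepResult {
    final String label;
    final Status status;
    final String message;
    StepResult(String label, Status status, String message) {
      this.label = label;
      this.status = status;
      this.message = message;
    }
  }

  /** A unit of bootstrap work that may throw. */
  interface StepAction {
    void execute() throws Exception;
  }

  /** Runs every step in iteration order; a failure is logged and recorded, never rethrown. */
  static List<StepResult> runAll(Map<String, StepAction> stepsByLabel) {
    List<StepResult> results = new ArrayList<>();
    for (Map.Entry<String, StepAction> entry : stepsByLabel.entrySet()) {
      String label = entry.getKey();
      try {
        entry.getValue().execute();
        results.add(new StepResult(label, Status.SUCCEEDED, null));
      } catch (Exception e) {
        LOG.warning("Bootstrap step '" + label + "' failed: " + e.getMessage());
        results.add(new StepResult(label, Status.FAILED, e.getMessage()));
      }
    }
    return results;
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps the steps in the order they were declared
    Map<String, StepAction> steps = new LinkedHashMap<>();
    steps.put("step 1", () -> { });
    steps.put("step 2", () -> { throw new IllegalStateException("guaranteed failure"); });
    steps.put("step 3", () -> { });
    for (StepResult result : runAll(steps)) {
      System.out.println(result.label + ": " + result.status);
    }
  }
}
```

Collecting a result per step, rather than only logging, also gives a natural source for any per-step status that needs to be reported back to a caller.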
Bootstrap File
There are many ways to represent the bootstrap file. JSON is a consistent choice, since many of our other config files are already in JSON.
The bootstrap file is a list of steps, where each step has the same format. This makes the ordering unambiguous and gives a mostly standard format for each step. Each step has a short label, which is used when logging a warning that the step failed. Each step also has a type that determines what action is performed, and a run condition that defines whether the step runs every time CDAP starts up or just once for the entire CDAP instance.
{
  "steps": [
    {
      "label": "Load system artifacts",
      "type": "LOAD_SYSTEM_ARTIFACTS",
      "runCondition": "ALWAYS"
    },
    {
      "label": "Create ABC profile",
      "type": "CREATE_PROFILE",
      "runCondition": "ONCE",
      "arguments": {
        "name": "ABC",
        "provisioner": {
          "name": "gcp-dataproc",
          "properties": {
            "accountKey": "${dataproc-key}",
            ...
          }
        }
      }
    },
    {
      "label": "Set ABC as default system profile",
      "type": "SET_SYSTEM_PROPERTIES",
      "runCondition": "ONCE",
      "arguments": {
        "properties": {
          "system.profile.name": "ABC"
        }
      }
    },
    {
      "label": "Create DataPrep with Default Connections",
      "type": "CREATE_APPLICATION",
      "runCondition": "ONCE",
      "arguments": {
        "namespace": "default",
        "name": "dataprep",
        "artifact": {
          "name": "wrangler-service",
          "version": "[3.0.0,4.0.0)",
          "scope": "SYSTEM"
        },
        "config": {
          "connections": [
            {
              "name": "Company Cloud Storage",
              "type": "GCS",
              "properties": { ... }
            }
          ]
        }
      }
    },
    {
      "label": "Start DataPrep Service",
      "type": "START_PROGRAM",
      "runCondition": "ALWAYS",
      "arguments": {
        "namespace": "default",
        "application": "dataprep",
        "type": "service",
        "program": "service"
      }
    }
  ]
}
The example above loads system artifacts, creates a profile named 'ABC', sets 'ABC' as the default profile for all of CDAP, creates the dataprep application configured with a default cloud storage connection, and finally starts the dataprep service if it is not already running.
The steps to load system artifacts and start the dataprep service are performed every time CDAP starts up. Creating the profile, setting it as the default profile, and creating the dataprep app happen only once per CDAP instance.
The bootstrap file can be represented in Java as:
import com.google.gson.JsonObject;
import java.util.List;

public class BootstrapConfig {
  private final List<BootstrapStep> steps;
}

public class BootstrapStep {
  private final String label;
  private final Type type;
  private final RunCondition runCondition;
  private final JsonObject arguments;

  public static enum Type {
    LOAD_SYSTEM_ARTIFACTS,
    CREATE_PROFILE,
    SET_SYSTEM_PROPERTIES,
    CREATE_APPLICATION,
    START_PROGRAM;
  }

  public static enum RunCondition {
    ONCE,   // only automatically run this once
    ALWAYS; // automatically run this on every CDAP start
  }
}
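To illustrate how the run condition might be applied at startup, the sketch below selects the steps to execute given whether the instance has already been bootstrapped. It uses a cut-down, hypothetical `Step` type carrying only a label and a run condition, and assumes the already-bootstrapped flag has been read from wherever that state is persisted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; a simplified stand-in for the bootstrap step model. */
class StartupStepSelectorSketch {
  enum RunCondition { ONCE, ALWAYS }

  /** Minimal step carrying only the fields needed for selection. */
  static final class Step {
    final String label;
    final RunCondition runCondition;
    Step(String label, RunCondition runCondition) {
      this.label = label;
      this.runCondition = runCondition;
    }
  }

  /**
   * Returns the steps to run automatically on this start, preserving file order.
   * ALWAYS steps run on every start; ONCE steps run only until the instance is bootstrapped.
   */
  static List<Step> selectForStartup(List<Step> steps, boolean alreadyBootstrapped) {
    List<Step> selected = new ArrayList<>();
    for (Step step : steps) {
      if (step.runCondition == RunCondition.ALWAYS || !alreadyBootstrapped) {
        selected.add(step);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<Step> steps = List.of(
      new Step("Load system artifacts", RunCondition.ALWAYS),
      new Step("Create ABC profile", RunCondition.ONCE),
      new Step("Start DataPrep Service", RunCondition.ALWAYS));

    System.out.println(selectForStartup(steps, false).size()); // first start: all 3 steps
    System.out.println(selectForStartup(steps, true).size());  // later start: the 2 ALWAYS steps
  }
}
```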
API changes
New Programmatic APIs
None
Deprecated Programmatic APIs
None
New REST APIs
Path | Method | Description | Response Code | Response |
---|---|---|---|---|
/v3/bootstrap | POST | Re-run the bootstrap steps | 200: the steps ran, even if all of them failed; 400: the bootstrap file is malformed; 500: an internal error prevented the steps from running | { "steps": [ { "label": "...", "status": "SUCCEEDED" \| "FAILED", "message": "error message" } ] } |
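As a usage illustration, a client could build the re-run request with the JDK's built-in HTTP client (Java 11 or later), as sketched below. The base URL is an assumption for a local instance, the empty request body is assumed because no payload is specified for this endpoint, and any authentication headers a secured instance would require are omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Illustrative sketch only; builds, but does not send, the re-run request. */
class RerunBootstrapRequestSketch {

  /** Builds a POST request for the bootstrap endpoint under the given base URL. */
  static HttpRequest buildRerunRequest(String baseUrl) {
    return HttpRequest.newBuilder()
      .uri(URI.create(baseUrl + "/v3/bootstrap"))
      // Assumption: the endpoint takes no request body
      .POST(HttpRequest.BodyPublishers.noBody())
      .build();
  }

  public static void main(String[] args) {
    // Hypothetical address of a local CDAP instance; adjust for the real deployment
    HttpRequest request = buildRerunRequest("http://localhost:11015");
    System.out.println(request.method() + " " + request.uri());
    // To send it against a running instance:
    // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
  }
}
```

A caller would then inspect the status code first, since a 200 only means the steps were attempted, and read the per-step status in the response body to find out which steps failed.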
Deprecated REST API
None
CLI Impact or Changes
None
UI Impact or Changes
None
Security Impact
What's the impact on Authorization and how does the design take care of this aspect
Impact on Infrastructure Outages
None
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
1 | Move existing logic that loads system artifacts and deploys apps to bootstrap framework | System artifacts loaded on restart, wrangler automatically deployed and started for sandbox on every restart. |
2 | Create bootstrap file that creates a profile and sets it as default profile | After starting CDAP, the profile should appear in the system profile list as the default profile |
3 | Create bootstrap file with 3 steps where the second step is guaranteed to fail | Steps 1 and 3 should complete and there should be a warning about step 2 |
4 | Manually create a profile that has the same name as a bootstrap profile, but with different properties | Bootstrap step should be skipped |
5 | Manually set preferences that have the same keys but different values as those in the bootstrap step | Bootstrap step should not overwrite existing preferences |
Releases
Release 5.1.0
Related Work
None
Future work
Potentially add more supported actions