Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
We want to remove the usage of the upgrade tool so that we can move towards the goal of zero/minimal downtime.
Goals
For this specific work, the goal is to remove the upgrade of metadata state from the Upgrade Tool and instead move it to background threads started in the individual stores - DatasetBasedTimeScheduleStore, DatasetBasedStreamSizeScheduleStore, AppMetadataStore.
User Stories
- User A wants to upgrade from CDAP version X to Y with minimal downtime. Since we require CDAP and its programs to be stopped while the upgrade tool is running, the user wants the upgrade tool to finish as quickly as possible. This implies doing as little work as possible in the upgrade tool and moving the rest into the CDAP services.
- User B has manual replication set up from cluster A to cluster B. While B is passive and being upgraded, we can't start the transaction manager and update the HBase table entries; this needs to be done while the cluster is active. Thus any transactional data-modification operation should happen after CDAP starts up and not in the upgrade tool.
Design
Currently the Upgrade Tool performs three high-level operations -
a) upgrade the coprocessors of CDAP Datasets
b) modify the stream store (this step will be removed, since it was already present in 3.5)
c) add app versions to three datasets - DatasetBasedTimeScheduleStore, DatasetBasedStreamSizeScheduleStore, AppMetadataStore
Step a) is performed serially, so its contribution to the upgrade tool's run time is proportional to the number of datasets in CDAP.
Step c) needs to be moved into the respective data stores, and the upgrade tool should no longer perform that operation.
Approach
Parallelizing Coprocessor Upgrade:
Currently this step disables the table (a synchronous call), changes the table descriptor and re-enables the table, for all tables one by one. This is expected to take a long time, especially when there are a lot of CDAP-managed HBase tables, and the accumulated time might exceed the upgrade window. To optimize this, we can use a thread pool and submit a 'disable -> change table descriptor -> enable' job for each table to that executor pool, achieving parallelism for the coprocessor upgrade operation. This can minimize the time taken by the coprocessor upgrade step in the Upgrade Tool. The number of threads in the thread pool can be made configurable as well, so it can be tuned as needed.
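As an illustration, a minimal sketch of how the parallel coprocessor upgrade could be structured, assuming the HBase Admin API (disableTable / modifyTable / enableTable) and a configurable pool size; the class name and the descriptor-update step are placeholders, not the actual Upgrade Tool code:

```java
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ParallelCoprocessorUpgrade {

  public void upgrade(Admin admin, List<TableName> cdapTables, int numThreads) throws Exception {
    // Pool size would come from a new (hypothetical) CDAP configuration property.
    ExecutorService pool = Executors.newFixedThreadPool(numThreads);
    List<Future<?>> futures = new ArrayList<>();
    for (final TableName table : cdapTables) {
      futures.add(pool.submit(() -> {
        // disable -> change table descriptor (new coprocessor jars) -> enable
        admin.disableTable(table);
        HTableDescriptor descriptor = admin.getTableDescriptor(table);
        // ... update the coprocessor entries on the descriptor here ...
        admin.modifyTable(table, descriptor);
        admin.enableTable(table);
        return null;
      }));
    }
    for (Future<?> future : futures) {
      future.get();  // propagate the first failure, if any
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }
}
```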
Adding App Version to System Datasets using Background Upgrade Threads:
For each of the datasets where the app version needs to be added, the upgrade happens in the background while the store continues to read the old data format as well.
Step 1) Since we can't upgrade the datasets in the upgrade tool, we need to do it after CDAP starts up. That means each dataset store should be able to work with both the old format and the new versioned format.
Step 2) The store will check whether the app version needs to be upgraded (based on a key in the table that indicates the last CDAP version of the dataset). If it is not the latest, a background thread is started to update the entries.
Step 3) During normal dataset operations (for example, pausing, deleting or adding a schedule), the following rules must be followed (see the sketch after this list):
- For Update of Record - only update the versioned entry
- For Addition of Record - only add the versioned entry
- For Deletion of Record - check both the versioned and non-versioned entry and delete them
- For List of Records - scan both with and without versions, add the default version to the version-less results, then combine both lists and return the result
- Transactional operations should be retried on TransactionConflictException, since the background thread may be updating the same records
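A minimal sketch of the read/write rules above, assuming a simplified key/value view of the underlying dataset; RecordTable, the scan methods and the key helper are hypothetical stand-ins for the actual store/dataset APIs. Callers would wrap these operations in transactions and retry on TransactionConflictException, since the background upgrade thread may be touching the same rows.

```java
import java.util.HashMap;
import java.util.Map;

public class DualFormatStoreSketch {

  interface RecordTable {
    void write(String rowKey, byte[] value);
    void delete(String rowKey);
    Map<String, byte[]> scanVersioned();     // rows that already contain the app version
    Map<String, byte[]> scanVersionLess();   // rows still in the pre-upgrade format
  }

  private static final String DEFAULT_VERSION = "-SNAPSHOT";
  private final RecordTable table;

  DualFormatStoreSketch(RecordTable table) {
    this.table = table;
  }

  // Update / Add: only write the versioned entry.
  void upsert(String versionedKey, byte[] value) {
    table.write(versionedKey, value);
  }

  // Delete: remove both the versioned and the version-less entry.
  void delete(String versionedKey, String versionLessKey) {
    table.delete(versionedKey);
    table.delete(versionLessKey);
  }

  // List: scan both formats, add the default version to version-less rows and merge,
  // letting an existing versioned row win over its converted version-less counterpart.
  Map<String, byte[]> list() {
    Map<String, byte[]> combined = new HashMap<>();
    for (Map.Entry<String, byte[]> e : table.scanVersionLess().entrySet()) {
      combined.put(addDefaultVersion(e.getKey()), e.getValue());
    }
    combined.putAll(table.scanVersioned());
    return combined;
  }

  // Hypothetical helper assuming the trigger-key layout (other stores' keys differ):
  // namespace:application:rest... -> namespace:application:-SNAPSHOT:rest...
  private String addDefaultVersion(String versionLessKey) {
    String[] parts = versionLessKey.split(":", 3);
    return parts[0] + ":" + parts[1] + ":" + DEFAULT_VERSION + ":" + parts[2];
  }
}
```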
Data Format:
- In DatasetBasedTimeScheduleStore, the trigger key is of the format: namespace:application:type:programname:schedulename and the job key has the format : namespace:application:type:program. We need to insert application version ('-SNAPSHOT') between the application and program type.
- In DatasetBasedStreamSizeScheduleStore, the row key is of the format: streamSizeSchedule:namespace:application:type:program:schedulename and we need to insert ('-SNAPSHOT') between the application and program type.
- In AppMetadataStore, the row key is a RunLengthEncoded value of recordType, namespace, application, programType, programName (invertedTs, runId). We need to insert the application version ('-SNAPSHOT') between the application and the program type.
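For example (entity names here are purely illustrative), a DatasetBasedTimeScheduleStore trigger key default:PurchaseApp:workflow:PurchaseWorkflow:dailySchedule would become default:PurchaseApp:-SNAPSHOT:workflow:PurchaseWorkflow:dailySchedule after the upgrade, and a DatasetBasedStreamSizeScheduleStore row key streamSizeSchedule:default:PurchaseApp:workflow:PurchaseWorkflow:sizeSchedule would become streamSizeSchedule:default:PurchaseApp:-SNAPSHOT:workflow:PurchaseWorkflow:sizeSchedule.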
Background Threads:
- Threads are started in each Store whenever it detects that the latest CDAP version doesn't match the upgraded version of the Dataset
- The logic to upgrade the entries in the dataset is already present in each store. The threads can leverage that logic.
- When the thread finds an entry to update, it should check whether an entry with the updated version already exists in the dataset. If it does, the thread should only remove the version-less entry and not overwrite the versioned one (since the versioned entry could have been written by the store before the upgrade thread reached that entry).
- When all the entries have been upgraded, the thread should set the latest version of the dataset to the current version and then exit.
- We will have an entry in the table: row "upgrade.dataset.time.schedule", column "version", value "4.1.0". The background thread sets this entry to the latest version once it has upgraded all entries. The store also checks this entry to decide whether the upgrade thread needs to be started (this could be as simple as the version not matching the latest version, or the entry being absent / the version being less than 3.6.x). This row can be expanded with more columns that might, in the future, give insight into the progress of the upgrade. For the REST API, the store will have a method that returns the version value from this row, from which the REST handler can figure out whether a particular dataset upgrade is still in progress. A sketch follows below.
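A sketch of the version check and the background upgrade thread described above; the store method names (readUpgradeVersion, scanVersionLessEntries, etc.) are hypothetical stand-ins for the actual store implementations.

```java
public class BackgroundUpgradeSketch {

  private static final String CURRENT_VERSION = "4.1.0";

  void startIfNeeded(ScheduleStore store) {
    String upgradedVersion = store.readUpgradeVersion();   // null if the version row is absent
    if (CURRENT_VERSION.equals(upgradedVersion)) {
      return;  // dataset already upgraded, nothing to do
    }
    Thread upgradeThread = new Thread(() -> {
      for (String versionLessKey : store.scanVersionLessEntries()) {
        String versionedKey = store.addDefaultVersion(versionLessKey);
        if (!store.exists(versionedKey)) {
          // No versioned copy yet: write one from the old entry.
          store.copyEntry(versionLessKey, versionedKey);
        }
        // If the versioned entry already existed, it was written by the store after the
        // upgrade started, so we must not overwrite it; only drop the version-less row.
        store.delete(versionLessKey);
      }
      // Only after every entry is upgraded, record the new dataset version. The REST
      // handler reads this value to report whether the upgrade is still in progress.
      store.writeUpgradeVersion(CURRENT_VERSION);
    }, "dataset-upgrade-thread");
    upgradeThread.setDaemon(true);
    upgradeThread.start();
  }

  interface ScheduleStore {
    String readUpgradeVersion();
    void writeUpgradeVersion(String version);
    Iterable<String> scanVersionLessEntries();
    String addDefaultVersion(String versionLessKey);
    boolean exists(String rowKey);
    void copyEntry(String fromKey, String toKey);
    void delete(String rowKey);
  }
}
```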
API changes
New Programmatic APIs
No Programmatic API changes.
Deprecated Programmatic APIs
None
New REST APIs
URL | Description | Response |
---|---|---|
/v3/system/upgrade/status | Returns the upgrade status of the system datasets, i.e. whether the background upgrade of each store is complete or still in progress, based on the version value in each dataset's upgrade row | JSON describing the upgrade status of each dataset (exact format to be decided) |
Deprecated REST API
None
CLI Impact or Changes
- NA
UI Impact or Changes
- NA
Security Impact
None, since the upgrade operations will happen in AppFabric in background threads and that process already has the privileges to modify these datasets.
Impact on Infrastructure Outages
Background upgrade threads will set the upgraded CDAP version only after the entire upgrade is complete. Until then, the upgrade thread will be (re)started by the respective stores. The upgrade threads will retry operations with a specific retry strategy in case of errors while writing to HBase.
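A minimal sketch of the kind of retry the upgrade threads could use around HBase writes; the retry count and backoff values below are assumptions, not finalized configuration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;

public final class UpgradeRetry {

  // Run the given operation, retrying with capped exponential backoff on failure.
  public static <T> T callWithRetries(Callable<T> operation, int maxRetries) throws Exception {
    long backoffMillis = 200;
    for (int attempt = 0; ; attempt++) {
      try {
        return operation.call();
      } catch (Exception e) {
        if (attempt >= maxRetries) {
          throw e;  // give up after the configured number of retries
        }
        TimeUnit.MILLISECONDS.sleep(backoffMillis);
        backoffMillis = Math.min(backoffMillis * 2, 10_000);
      }
    }
  }
}
```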
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
1 | 3.5 Installation with time and stream schedules and existing applications, run records, workflow tokens, workflow node state. Upgrade to 4.1 and verify the normal functionality of CDAP | 4.1 should work fine with full functionality |
2 | Same test as above, scan the three stores after some time to make sure the data in those datasets have been upgraded | All the dataset entries should have app versions |
3 | 4.0.1 Installation with the same setup as test 1 | 4.1 should work fine with full functionality |
Releases
Release 4.1.1
Related Work
Future work
- Parallelize the coprocessor upgrade step for as long as that step is still required