RS-001 Coprocessor Rolling Upgrade
Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
One of the reasons CDAP must be stopped before an upgrade is so that the upgrade tool can be run to update the coprocessors for all CDAP tables. In order to minimize downtime, we would like to be able to upgrade coprocessors in a rolling fashion.
Goals
Design a method to upgrade CDAP HBase coprocessors in a rolling fashion, with minimal downtime.
User Stories
- As a cluster administrator, I want to be able to upgrade CDAP coprocessors without stopping CDAP
- As a cluster administrator, I want to be able to upgrade HBase without stopping CDAP
Design
Prior to 4.1.0, coprocessors are built and uploaded to HDFS when a dataset is created. When the HBase table is created, it is configured with the HDFS path of the coprocessor(s), the class name of the coprocessor(s), and the priority. During a CDAP upgrade, CDAP is stopped and an upgrade tool is run that loops through all tables. For each table, it disables the table, builds and uploads the new coprocessor jars, modifies the table to point to the new coprocessor(s) on HDFS, then re-enables the table. The advantage of this approach is that CDAP manages coprocessors itself, so cluster administrators don't need to know anything about coprocessors. The drawback is that upgrading the coprocessors requires downtime.
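The per-table steps of the pre-4.1.0 upgrade tool can be sketched as follows. This is only an illustration of the ordering described above, not the actual tool: `TableAdmin`, its three methods, and `build_and_upload_jar` are hypothetical names introduced for the sketch.

```python
from typing import Callable, Iterable, Protocol


class TableAdmin(Protocol):
    """Hypothetical admin interface; stands in for the HBase admin client."""

    def disable_table(self, table: str) -> None: ...
    def set_coprocessor_jar(self, table: str, jar_path: str) -> None: ...
    def enable_table(self, table: str) -> None: ...


def upgrade_coprocessors(
    admin: TableAdmin,
    tables: Iterable[str],
    build_and_upload_jar: Callable[[str], str],
) -> None:
    """Upgrade every table in turn; each table is offline while it is modified."""
    for table in tables:
        admin.disable_table(table)                  # table is unavailable from here
        jar_path = build_and_upload_jar(table)      # returns the new jar's HDFS path
        admin.set_coprocessor_jar(table, jar_path)  # point the table at the new jar
        admin.enable_table(table)                   # table is available again
```

Because every table is disabled while it is modified, the tool cannot run while applications are reading or writing, which is the downtime this design aims to remove.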
Approach
We first describe the approach for CDAP rolling upgrade, assuming that no HBase upgrade is happening.
Rolling CDAP upgrade
We will change the coprocessors used by Tables to be wrappers that look up the CDAP version, download the relevant coprocessor jar from HDFS, instantiate the relevant class, then delegate all calls to the instantiated class. To give more detail, on startup, CDAP will load all required coprocessors to predetermined locations on HDFS:
/cdap/lib/coprocessors/table-<cdap-version>-<hbase-version>.jar
For example, the actual coprocessor implementations will be placed on HDFS at:
/cdap/lib/coprocessors/table-4.1.0-1.1.0.jar
/cdap/lib/coprocessors/table-4.1.1-1.1.0.jar
/cdap/lib/coprocessors/table-4.1.2-1.1.0.jar
The wrapper coprocessor will also be placed on HDFS, but the same jar can be used for all versions of CDAP:
/cdap/lib/coprocessors/base-1.1.0.jar
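The naming scheme above can be expressed as two small helpers. This is a minimal sketch: the function names are hypothetical, and reading the `1.1.0` in `base-1.1.0.jar` as the HBase version is an assumption based on the example paths.

```python
import posixpath

# Directory taken from the example paths above.
COPROCESSOR_DIR = "/cdap/lib/coprocessors"


def table_coprocessor_path(cdap_version: str, hbase_version: str) -> str:
    """HDFS path of the versioned coprocessor implementation jar."""
    return posixpath.join(
        COPROCESSOR_DIR, f"table-{cdap_version}-{hbase_version}.jar"
    )


def wrapper_coprocessor_path(hbase_version: str) -> str:
    """HDFS path of the wrapper jar, shared by all CDAP versions.

    Assumption: the version in the file name is the HBase version.
    """
    return posixpath.join(COPROCESSOR_DIR, f"base-{hbase_version}.jar")
```

Keeping the CDAP version out of the wrapper's file name is what lets tables keep pointing at the same jar across CDAP upgrades.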
The wrapper coprocessor will be the one that each HBase table is configured to use. When it starts up, it will read the CDAP version from a predefined table, download the required coprocessor jar, create a classloader from it, and instantiate the actual coprocessor class. This change is completely transparent to CDAP users and cluster administrators.
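The wrapper's startup and delegation behavior can be sketched as below. The real wrapper would be a Java coprocessor that builds a classloader from the downloaded jar; this Python sketch only models the control flow. `WrapperCoprocessor`, `read_cdap_version`, and `load_implementation` are hypothetical names, and the version lookup and jar loading are injected as callables rather than implemented.

```python
from typing import Any, Callable


class WrapperCoprocessor:
    """Resolves the real coprocessor at startup and forwards all calls to it."""

    def __init__(
        self,
        read_cdap_version: Callable[[], str],        # reads the predefined table
        load_implementation: Callable[[str], Any],   # jar path -> coprocessor object
        hbase_version: str,
    ) -> None:
        self._read_cdap_version = read_cdap_version
        self._load_implementation = load_implementation
        self._hbase_version = hbase_version
        self._delegate = None

    def start(self) -> None:
        """Look up the CDAP version and load the matching implementation."""
        cdap_version = self._read_cdap_version()
        jar_path = (
            f"/cdap/lib/coprocessors/"
            f"table-{cdap_version}-{self._hbase_version}.jar"
        )
        self._delegate = self._load_implementation(jar_path)

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes the wrapper does not define itself,
        # i.e. the coprocessor hooks, which are forwarded to the delegate.
        if self._delegate is None:
            raise RuntimeError("start() must be called before use")
        return getattr(self._delegate, name)
```

Since the table only references the wrapper, a new CDAP version changes which implementation jar is resolved without requiring the table to be disabled and modified.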
Rolling HBase upgrade
Rolling HBase upgrade will be considered an advanced configuration that requires additional work from the cluster administrator. We will add a configuration setting 'master.manage.coprocessors' that defaults to 'true'. When it is true, CDAP handles coprocessors the same way as before and cluster administrators don't have to do any additional work. However, it also means there will be downtime when upgrading CDAP or HBase. When it is false, CDAP will specify only the wrapper coprocessor class name and priority when it creates an HBase table, but not the HDFS path. Instead of being placed on HDFS, the CDAP wrapper coprocessor jar must be installed on every HBase node and included in the HBase classpath.
In order to upgrade HBase in a rolling fashion, cluster administrators must install the new CDAP wrapper coprocessor on the node to be upgraded and restart the regionserver.
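The difference between the two modes of 'master.manage.coprocessors' comes down to whether the table's coprocessor entry carries a jar path. A minimal sketch, assuming a dictionary stands in for the table's coprocessor attribute; the function name, dictionary keys, class name, and priority value used here are hypothetical.

```python
from typing import Optional


def wrapper_coprocessor_spec(
    class_name: str,
    priority: int,
    manage_coprocessors: bool = True,   # mirrors the 'true' default of the setting
    jar_path: Optional[str] = None,
) -> dict:
    """Build the coprocessor entry CDAP would attach to a new HBase table."""
    spec = {"class_name": class_name, "priority": priority}
    if manage_coprocessors:
        # CDAP-managed mode: the table points at the wrapper jar on HDFS.
        if jar_path is None:
            raise ValueError("jar_path is required when CDAP manages coprocessors")
        spec["jar_path"] = jar_path
    # Otherwise no path is recorded: the class must already be on the
    # HBase classpath of every node, installed by the administrator.
    return spec
```

For example, `wrapper_coprocessor_spec("ExampleWrapper", 1, manage_coprocessors=False)` yields an entry with no `jar_path`, so the class is resolved from the jar the administrator installed on that node.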
Both
Since the change to support rolling CDAP upgrades is internal to CDAP, the work to support both is the same as the work to support just rolling HBase upgrades.
API changes
No changes to programmatic APIs
New REST APIs
No REST API changes
CLI Impact or Changes
- None
UI Impact or Changes
- None
Security Impact
What's the impact on Authorization and how does the design take care of this aspect
Impact on Infrastructure Outages
System behavior (if applicable, document the impact of downstream component failures [YARN, HBase, etc.]) and how the design takes care of these aspects
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
1 | Run an app that uses all coprocessor features (readless increments, etc) on CDAP 3.5.2. Perform a rolling upgrade without stopping the app. | Table contents are as expected |
2 | Run an app that uses all coprocessor features on CDAP 4.1.0. Perform a rolling upgrade of HBase to another supported version without stopping the app. | Table contents are as expected |
Releases
Release 4.1.0
Related Work
- Work #1
- Work #2
- Work #3