Introduction

This is a major undertaking to improve resiliency, support rolling upgrades and downgrades, handle cross-cluster or cross-data-center replication, and improve the performance of certain components. This area will act as a central place for all the requirements gathered, design documents, and other technical discussions related to this topic.

Initiative Details

Start Release : 3.6
Working Group Mailing List :
Slack Channel : tyche-group.slack.com
Approximate Timeline : Best-case : 6 months, Worst-case : 10 months
Primary Technical Contact : Terence Yim
Primary Program Contact : priyanambiar

Tracks and Execution Layout

Here we define the different tracks that are part of this initiative and break them down into multiple releases. Please note that the problems being solved as part of this initiative are challenging and complex, so the work for each track has been broken down into multiple releases.

Legend

Link to design document
Associated JIRA in the Cask or Apache project issue tracker

Tracks	Release 1	Release 2	Release 3	Release 4	Release 5
Status
Stage	Design & Design Reviews	Requirement Gathering	Requirement Gathering	Requirement Gathering	Requirement Gathering
Data Replication	RS-016 - Application Versioning; HBase DDL SPI; Handling Kafka offset mismatches; Replication status tool (Design WIP); Transactional Messaging System	Metadata updates via Transactional Messaging; Design for distributed transactions; Replication tools - phase 1	Program state change updates via Transactional Messaging; Distributed transactions implementation - phase 1; Replication of metadata; Replication tools - phase 2	Replication of program state changes; Distributed transactions implementation - phase 2; Design for namespace replication; Replication tools - phase 3	Namespace replication; Replication tools - phase 4
Disaster Recovery
Resilience	RS-001 - Minimize interruption caused by update of coprocessor; RS-002 - Client Resiliency; RS-016 - Application Versioning; RS-004 - CDAP version definition and guarantees of version	RS-002 - Client Resiliency; RS-003 - Make CDAP system services HA; RS-008 - Apache Twill Application rolling upgrade; RS-011 - Hydrator Pipeline upgrade; RS-013 - Test framework Chaos Monkey; RS-014 - User Interface/CLI	RS-010 - Progressive background upgrade tool; RS-005 - Internal Schema evolution and Management; RS-006 - Managing Infrastructure Incompatibility; RS-012 - Dataset Upgrade; RS-013 - Test framework Chaos Monkey; RS-014 - User Interface/CLI	RS-009 - Upgrade Orchestrator; RS-007 - System State Transition and Management; RS-006 - Managing Infrastructure Incompatibility; RS-012 - Dataset Upgrade; RS-015 - Support for Rollbacks; RS-014 - User Interface/CLI	RS-009 - Upgrade Orchestrator; RS-006 - Managing Infrastructure Incompatibility; RS-015 - Support for Rollbacks; RS-014 - User Interface/CLI


Performance & Scalability	P&S-001 - Invalid list pruning with major compaction; P&S-002 - Tool to report progress on pruning and inspecting invalid list	P&S-003 - Performance improvement of single transaction server (optimize read-only client; reduce lock granularity; improve group commit efficiency)	P&S-004 - Transaction server per namespace for resource isolation	P&S-005 - Support for running multiple instances of transaction server in active-active mode	P&S-006 - Invalid list pruning with stripe compaction
What does this release provide?	Dynamic loading of HBase compat modules within the coprocessor, which will relieve the upgrade tool from disabling HBase tables. Handling resiliency in interacting with HBase and ZooKeeper. Documentation describing how CDAP is versioned and the guarantees it provides in light of the rolling upgrade and resiliency requirements being added to the platform. No more manual process for cleaning up the invalid transaction list, so no manual maintenance is required. Improved throughput of transactional data operations. Reduced network traffic and reduced transaction metadata.
User Stories Enabled by the Release	Hot-Cold Replication: Users should be able to set up hot-cold replication using CDAP. Users will be provided a tool to get the status of HBase replication to determine when it is safe to switch a cluster from cold to hot. Users should be able to implement a hook to manage HBase DDL across replicated clusters. CDAP will provide capabilities to automatically handle Kafka offset mismatches.

Terminologies

Terminology	Definition
CDAP Active/Standby	CDAP is running in only one cluster (the active cluster); CDAP shouldn't be running in any other cluster (the standby clusters). User applications can run only in the active cluster. Data is replicated by means outside of CDAP control (HBase replication, HDFS copy, Kafka mirror-maker). Data is available on all clusters and can be read on standby clusters outside of CDAP. High-level failover steps: (1) stop all running apps in the active cluster; (2) stop CDAP in the active cluster; (3) wait for all data replication to settle, checking via tools; (4) pick a standby cluster to be the next active cluster and start CDAP there; (5) start applications on the new active cluster. Strictly speaking, CDAP is not aware of the replication at all. Since CDAP is not aware of the replication, all data needs to be replicated; otherwise, inconsistency could occur when restarting CDAP on a different cluster than the previously active one. Already doable in CDAP 3.5. CDAP 4.1 added an extra API and tools to assist: an API for externalizing HBase table creation, and tools for checking replication status.
Hot/Cold replication	Same as CDAP Active/Standby
Active/Passive replication	Same as CDAP Active/Standby
CDAP Active/Active	CDAP is running in all clusters. CDAP is aware of the state replication between all CDAP instances. It relies on external means for data replication: CDAP system tables in HBase; Kafka topics for log collection. User data replication is outside of CDAP control and depends on what user applications use, e.g. HBase replication, HDFS copy and Kafka mirror-maker. It is still active/standby from the application point of view: the user can declare which cluster is the active one at the namespace level, which namespaces need replication and which do not, and can change the active cluster for a namespace. To switch the active cluster, the following will happen: (1) stop all running applications in that namespace in the active cluster; (2) pick another cluster as the new active cluster for the namespace involved; (3) start applications in that namespace again in the new active cluster. CDAP will provide an easy switch to perform the three steps described above on behalf of the user.
Hot/Hot replication	Same as CDAP Active/Active
CDAP Active/Active with Application Master/Slaves	Has everything described in CDAP Active/Active. User data is still replicated by means outside of CDAP control. For a namespace: the active cluster for that namespace is the "master", and writes only happen in the master cluster. All other clusters are the "slave" clusters: they receive user data updates from the "master" cluster via replication and can start applications in that namespace for "read-only" operation. The same application that is already running in the "master" cluster cannot be started on any "slave" cluster.
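
The namespace-level rules in the definitions above can be illustrated with a small model. This is only a sketch under stated assumptions, not CDAP code: the `Namespace` class, its fields, and the `start_app` and `switch_active` methods are hypothetical names invented for illustration. It shows the three-step active-cluster switch from the CDAP Active/Active definition and the read-only restriction from the Application Master/Slaves definition.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Hypothetical model of one replicated namespace (not a CDAP API)."""

    name: str
    clusters: set[str]
    active: str  # the active ("master") cluster for this namespace
    # cluster name -> names of applications currently running there
    running: dict[str, set[str]] = field(default_factory=dict)

    def start_app(self, cluster: str, app: str, read_only: bool = False) -> None:
        if cluster not in self.clusters:
            raise ValueError(f"unknown cluster: {cluster}")
        if cluster != self.active:
            # Writes only happen in the active cluster.
            if not read_only:
                raise PermissionError("non-active clusters are read-only")
            # An application already running in the active cluster
            # cannot also be started on another cluster.
            if app in self.running.get(self.active, set()):
                raise PermissionError(f"{app} already runs in the active cluster")
        self.running.setdefault(cluster, set()).add(app)

    def switch_active(self, new_active: str) -> list[str]:
        """Perform the three-step switch; return the applications moved."""
        if new_active not in self.clusters:
            raise ValueError(f"unknown cluster: {new_active}")
        if new_active == self.active:
            return []
        # Step 1: stop all running applications in the active cluster.
        apps = sorted(self.running.pop(self.active, set()))
        # Step 2: pick the new active cluster for the namespace.
        self.active = new_active
        # Step 3: start the applications again in the new active cluster.
        self.running.setdefault(new_active, set()).update(apps)
        return apps


ns = Namespace("sales", {"east", "west"}, active="east")
ns.start_app("east", "ingest")
moved = ns.switch_active("west")
print(ns.active, moved)
```

The model deliberately ignores data replication itself, which the definitions place outside of CDAP control; a real switch would also have to wait for replication to settle before step 3.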

Reference Documents

Functional Components and their impacts – here
RAFT Protocol – here
Cloudera Software Integration Guide – here

Resiliency, Replication, Rolling Upgrade, Performance
