...

  • CDAP system resiliency to infrastructure unavailability or interruption for long periods of time
  • CDAP Rolling Upgrades
  • CDAP Application Rolling Upgrades

...

  • Compatible Upgrade or Downgrade of underlying infrastructure Hadoop components
    • When the underlying Hadoop infrastructure is being upgraded or downgraded, CDAP and CDAP Applications are expected to tolerate, and be resilient to, infrastructure services being unavailable during the process.
    • The upgrade or downgrade process could take anywhere from 30 minutes to 18 hours or more.
    • During the period of service unavailability or interruption, CDAP and CDAP Applications operate in a degraded mode.
    • The Hadoop infrastructure upgrade or downgrade has to be compatible with CDAP and CDAP Applications in order to have smooth upgrades.
    • If there are issues during the upgrade, CDAP should be resilient to rollbacks.
    • CDAP and CDAP Applications should also be able to withstand compatible downgrades
    • The compatibility matrix should be available to users to ensure smooth upgrades
  • Upgrade / Downgrade of CDAP
    • Upgrade a CDAP version. Major and minor version upgrades could have different impacts; these impacts are discussed further in the document.
    • Roll back of CDAP upgrade or downgrade
    • CDAP version compatibility matrix available to users
    • Dry run for upgrade and downgrade
  • Upgrade / Downgrade of CDAP Applications
    • Upgrade or Downgrade a CDAP Application
    • Rolling upgrade of live services like CDAP Services, Flow and Spark Streaming
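
One way a client could tolerate the long service interruptions described above is to retry failed calls with capped exponential backoff until a deadline passes. The Python sketch below is illustrative only: `ServiceUnavailable`, `call_with_retry`, and all parameter names are hypothetical and are not part of CDAP or any Hadoop client API.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical error raised when an infrastructure service is down."""


def call_with_retry(operation, max_wait_seconds, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Retry `operation` with capped exponential backoff.

    Gives up (re-raises) once the next wait would go past the deadline.
    `sleep` and `clock` are injectable so the logic can be tested without waiting.
    """
    deadline = clock() + max_wait_seconds
    delay = base_delay
    while True:
        try:
            return operation()
        except ServiceUnavailable:
            # Stop retrying if waiting again would exceed the allowed window.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            # Double the delay each time, but never beyond max_delay.
            delay = min(delay * 2, max_delay)
```

With an upgrade window of up to 18 hours, a caller would pass a correspondingly large `max_wait_seconds`; the cap on the delay keeps the client probing at a steady rate once the backoff has grown.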

High Level Tickets

RS-001 : Uninterrupted update of compat modules in Coprocessor

The CDAP system uses ... few HBase coprocessors to optimize the operations performed on HBase. When the underlying HBase is upgraded, ... the table has to be altered ... . This means that ... the ... table has to be ... disabled. Disabling the table can have multiple side effects on CDAP, so the currently recommended approach is to stop CDAP as well as the applications running within it. For each non-compatible version of HBase, CDAP has a compat module that has to be updated.
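
The idea of one compat module per non-compatible HBase version can be sketched as a lookup keyed on the HBase major and minor version. This is a minimal Python illustration under stated assumptions: the `COMPAT_MODULES` table, the module names in it, and `compat_module_for` are hypothetical and do not describe how CDAP actually selects its modules.

```python
# Hypothetical mapping from HBase (major, minor) to a compat module name.
COMPAT_MODULES = {
    (0, 96): "hbase-compat-0.96",
    (0, 98): "hbase-compat-0.98",
    (1, 0): "hbase-compat-1.0",
}


def compat_module_for(hbase_version):
    """Return the compat module name for an HBase version string.

    Only the first two dot-separated components are used, so patch
    releases and vendor suffixes map to the same module.
    """
    parts = hbase_version.split(".")
    key = (int(parts[0]), int(parts[1]))
    try:
        return COMPAT_MODULES[key]
    except KeyError:
        # An unknown version is treated as incompatible rather than guessed.
        raise ValueError(f"No compat module for HBase {hbase_version}") from None
```

Failing loudly on an unknown version mirrors the requirement that infrastructure upgrades be compatible: an upgrade to a version with no matching module would be rejected up front instead of surfacing later as a coprocessor failure.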

RS-002 : Client Resiliency

RS-003 : Move Dataset management out of CDAP Master

RS-004 : CDAP Version definition and guarantees of versions

RS-005 : Rolling upgrade definition

RS-006 : Internal Schema Evolution and Management

RS-007 : Managing Infrastructure Incompatibility 

RS-008 : System state transition and management 

RS-009 : Apache Twill Application rolling upgrade 

RS-010 : 

  • HIGH Client Resiliency

  • Rolling Upgrade Definition
  • Internal Schema Evolution
  • Infrastructure Incompatibility 
  • State Transition and Management
  • Apache Twill Rolling Upgrade Support
  • Rolling upgrade orchestrator
  • Progressive background upgrade tool
  • User Interface / REST APIs / CLI
  • Testing Framework and Chaos monkey
  • Hydrator pipeline upgradability

...