Checklist

  •  User Stories Documented
  •  User Stories Reviewed
  •  Design Reviewed
  •  APIs reviewed
  •  Release priorities assigned
  •  Test cases reviewed
  •  Blog post

Introduction 

 

Goals

Clearly state the design goals/requirements for this feature 

User Stories 

  • Breakdown of User-Stories 
  • User Story #1
  • User Story #2
  • User Story #3

Design

Cover details on assumptions made, design alternatives considered, high level design

Approaches

Approach #1

The Runtime Monitor monitors and collects program states, metadata, lineage, workflow tokens, and so on.

To collect all the monitoring data, the Runtime Monitor periodically polls the Heartbeat Handler for heartbeat messages through a single REST endpoint:

  1. The Runtime Monitor requests the next batch of heartbeat messages, passing the last persisted offset for each topic.
  2. The Heartbeat Handler fetches heartbeat messages from each topic (status, lineage, metadata, and so on), starting from the last persisted offset provided by the Runtime Monitor.
  3. The Heartbeat Handler gathers all the heartbeat messages and sends them in a batch to the Runtime Monitor, along with the processed offsets for each topic.
    1.  If the Runtime Monitor fails, it restarts from the last persisted offset for each topic and requests the heartbeat messages after that offset.
    2.  If the Runtime Monitor fails while it is updating the corresponding stores, it may reprocess some heartbeat messages, depending on the last persisted offset.
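
The polling steps above can be illustrated with a minimal in-memory sketch. Everything in it is an assumption made for illustration: the class names, the method names `fetch_batch` and `poll_once`, the use of Python lists as topics, and the convention that an offset is the index of the next unread message. It is not the actual Runtime Monitor or Heartbeat Handler code.

```python
from typing import Dict, List, Optional, Tuple


class HeartbeatHandler:
    """Hypothetical stand-in: each topic is an append-only list of messages."""

    def __init__(self, topics: Dict[str, List[str]]) -> None:
        self.topics = topics

    def fetch_batch(
        self, offsets: Dict[str, int], batch_size: int
    ) -> Tuple[Dict[str, List[str]], Dict[str, int]]:
        """Return up to batch_size messages per topic plus the processed offsets."""
        messages: Dict[str, List[str]] = {}
        processed: Dict[str, int] = {}
        for topic, log in self.topics.items():
            start = offsets.get(topic, 0)  # unknown topics start at offset 0
            chunk = log[start:start + batch_size]
            messages[topic] = chunk
            processed[topic] = start + len(chunk)
        return messages, processed


class RuntimeMonitor:
    """Hypothetical poller that tracks one persisted offset per topic."""

    def __init__(
        self,
        handler: HeartbeatHandler,
        persisted_offsets: Optional[Dict[str, int]] = None,
        batch_size: int = 2,
    ) -> None:
        self.handler = handler
        self.batch_size = batch_size
        self.persisted_offsets: Dict[str, int] = dict(persisted_offsets or {})
        self.store: Dict[str, List[str]] = {}

    def poll_once(self) -> int:
        """Fetch one batch, update the stores, then persist the new offsets."""
        messages, processed = self.handler.fetch_batch(
            self.persisted_offsets, self.batch_size
        )
        count = 0
        for topic, chunk in messages.items():
            self.store.setdefault(topic, []).extend(chunk)
            count += len(chunk)
        # Offsets are persisted only after the stores are updated. A crash
        # between the two steps means some messages are processed again.
        self.persisted_offsets = processed
        return count


handler = HeartbeatHandler(
    {"status": ["s1", "s2", "s3"], "lineage": ["l1"], "metadata": []}
)
monitor = RuntimeMonitor(handler, batch_size=2)
first = monitor.poll_once()   # picks up s1, s2 and l1
second = monitor.poll_once()  # picks up s3
third = monitor.poll_once()   # nothing new
```

Because offsets are written after the store update in this sketch, a restart never skips a message but may replay some, which matches the reprocessing behaviour described in the failure cases above.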

Pros:

  • Fewer HTTP requests.
  • Because the Runtime Monitor polls periodically, a single REST endpoint means fewer requests for the web server running in the Heartbeat Handler to serve.

Cons:

  • Load must be balanced across all the topics so that recent information reaches the Runtime Monitor with very little delay. So 

Approach #2

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

Deprecated Programmatic APIs

New REST APIs

Path | Method | Description | Request Body | Response Code | Response
/v3/namespaces/{namespace}/programs/status | GET | Returns the list of status messages for all the programs for a given namespace | batchsize, start_offset | 200 - On success; 204 - No content; 404 - When application is not available; 500 - Any internal errors |
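
As a rough illustration of how a client might call this endpoint, the sketch below builds the request URL and maps the listed response codes to outcomes. It assumes the path resolves to `/v3/namespaces/{namespace}/programs/status` and that `batchsize` and `start_offset` travel as query parameters. The host name and the helper names `build_status_url` and `describe_response` are hypothetical.

```python
from urllib.parse import quote, urlencode


def build_status_url(base_url: str, namespace: str, batchsize: int, start_offset: int) -> str:
    """Build the polling URL (assumed path and query-parameter layout)."""
    path = "/v3/namespaces/{}/programs/status".format(quote(namespace, safe=""))
    query = urlencode({"batchsize": batchsize, "start_offset": start_offset})
    return "{}{}?{}".format(base_url.rstrip("/"), path, query)


def describe_response(code: int) -> str:
    """Map the response codes listed in the table to a short outcome."""
    outcomes = {
        200: "success",
        204: "no content",
        404: "not available",
        500: "internal error",
    }
    return outcomes.get(code, "unexpected response code")


# Hypothetical host; only the path and parameter names come from the table.
url = build_status_url("http://handler.example:8080/", "default", 100, 42)
```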

 
      

Deprecated REST API

Path | Method | Description
/v3/apps/<app-id> | GET | Returns the application spec for a given application

CLI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

UI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

Security Impact 

What's the impact on Authorization and how does the design take care of this aspect

Impact on Infrastructure Outages 

System behavior (if applicable - document impact on downstream [ YARN, HBase etc ] component failures) and how does the design take care of these aspects

Test Scenarios

Test ID | Test Description | Expected Results
   
   
   
   

Releases

Release X.Y.Z

Release X.Y.Z

Related Work

  • Work #1
  • Work #2
  • Work #3

 

Future work