Checklist

  • User Stories Documented
  • User Stories Reviewed
  • Design Reviewed
  • APIs reviewed
  • Release priorities assigned
  • Test cases reviewed
  • Blog post

Introduction 

Move CDAP application extensions such as Cask Tracker and Wrangler to the CDAP system namespace.

Goals

Currently, the CDAP applications for extensions such as Cask Tracker and Wrangler are created and run in the namespace in which the extension is enabled. Since [Jira issue, link unavailable] supports applications in the system namespace, we want to move these extensions to the system namespace and have them serve requests from all user namespaces. This reduces the overhead and resource footprint of the extensions.

User Stories 

  • Breakdown of User-Stories 
  • User Story #1
  • User Story #2
  • User Story #3

Design

Cover details on assumptions made, design alternatives considered, high level design

Approach

Moving Cask Tracker to the system namespace.

 The Tracker app contains two programs:

  • AuditLogFlow
    • AuditLogConsumerFlowlet - Reads from the TMS audit topic and emits the payload string to the next stage. The read offset is persisted to a key-value table (default: _auditOffset).
      • Note: by default the flowlet is configured to read from the "default" namespace and the "audit" topic; however, CDAP publishes audit messages to the "system" namespace. Need to verify whether this default was ever useful.
    • AuditLogPublisher - Deserializes the payload into an "AuditMessage" (it currently filters on the current namespace; that check will have to be skipped) and writes the audit message to "AuditLogTable" (custom dataset), "AuditMetricsCube" (backed by a cube dataset), and "LatestEntityTable" (custom dataset).

  • TrackerService
    • AuditLogHandler - Single endpoint, "query", which scans "AuditLogTable" based on the query params and responds with the result.
    • AuditMetricsHandler - Uses "AuditMetricsCube" to handle queries for "Top N Applications/Datasets/Programs" and "Histogram". Uses "LatestEntityTable" for the time-since endpoint (need to look into what that means).
    • TrackerMeterHandler - Uses "AuditMetricsCube" and "LatestEntityTable" for the truth meter score.
    • AuditTagsHandler - Uses "AuditTagsTable" to store tags and to promote/demote tags via REST endpoints. For some of the REST methods, it also talks to the metadata service directly, using ZooKeeper service discovery, to get metadata.
    • DataDictionaryHandler - Similar to the above, but uses the "dataDictionary" table.
    • ConfigurationHandler - Stores, retrieves, and deletes config using a ConfigTable (key-value table).
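
As a rough illustration of the two AuditLogFlow stages described above, the following self-contained Java sketch models the offset-tracking consumer and a publisher whose namespace check can be switched off. It does not use the CDAP APIs: `AuditRecord`, `AuditConsumerSketch`, `AuditPublisherSketch`, the `acceptAllNamespaces` flag, and the `"offset"` row key are all hypothetical names chosen for this example, and plain in-memory collections stand in for the TMS topic and the datasets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a deserialized audit message. */
final class AuditRecord {
    final String namespace;
    final String entityName;

    AuditRecord(String namespace, String entityName) {
        this.namespace = namespace;
        this.entityName = entityName;
    }
}

/** Consumer stage: hands out unread payloads and persists the read offset. */
final class AuditConsumerSketch {
    // In-memory stand-in for the key-value offset table (default name "_auditOffset").
    private final Map<String, Long> offsetTable = new HashMap<>();
    private static final String OFFSET_KEY = "offset"; // hypothetical row key

    /** Returns the payloads after the persisted offset, then advances the offset. */
    List<String> poll(List<String> topic) {
        int start = offsetTable.getOrDefault(OFFSET_KEY, 0L).intValue();
        List<String> unread = new ArrayList<>(topic.subList(start, topic.size()));
        offsetTable.put(OFFSET_KEY, (long) topic.size());
        return unread;
    }
}

/** Publisher stage: the namespace check is optional (assumption for this sketch). */
final class AuditPublisherSketch {
    private final String ownNamespace;
    private final boolean acceptAllNamespaces;
    // In-memory stand-in for "AuditLogTable".
    private final List<AuditRecord> auditLog = new ArrayList<>();

    AuditPublisherSketch(String ownNamespace, boolean acceptAllNamespaces) {
        this.ownNamespace = ownNamespace;
        this.acceptAllNamespaces = acceptAllNamespaces;
    }

    /** Stores the record unless the per-namespace filter rejects it. */
    boolean publish(AuditRecord record) {
        if (!acceptAllNamespaces && !ownNamespace.equals(record.namespace)) {
            return false; // current behavior: drop messages from other namespaces
        }
        auditLog.add(record);
        return true;
    }

    int storedCount() {
        return auditLog.size();
    }
}
```

With `acceptAllNamespaces` set to `false` the publisher behaves like the current per-namespace deployment; setting it to `true` models the system-namespace deployment, where messages from every user namespace are kept.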


  Datasets 

   The Tracker app creates/uses 6 datasets:

  • AuditLogTable
  • AuditMetricsCube  
  • LatestEntityTable

  • AuditTagsTable
  • DataDictionaryTable
  • ConfigTable
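
If these datasets end up shared by all user namespaces, one possible way to keep each namespace's rows separable is to prefix row keys with the namespace. The sketch below is only an assumption about how that could look, not the design chosen here: `NamespacedTableSketch`, the `:` separator, and the key layout are hypothetical, a `TreeMap` stands in for a sorted table, and the separator is assumed never to occur in a namespace name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical sorted table whose row keys are prefixed with the namespace. */
final class NamespacedTableSketch {
    // Assumed never to occur inside a namespace name.
    private static final char SEPARATOR = ':';
    // Sorted map standing in for a row-key-ordered dataset.
    private final TreeMap<String, String> rows = new TreeMap<>();

    /** Builds a row key of the form "<namespace>:<entityName>". */
    static String rowKey(String namespace, String entityName) {
        return namespace + SEPARATOR + entityName;
    }

    void put(String namespace, String entityName, String value) {
        rows.put(rowKey(namespace, entityName), value);
    }

    /** Returns the values of all rows that belong to the given namespace, in key order. */
    List<String> scan(String namespace) {
        String prefix = namespace + SEPARATOR;
        // Every key in [prefix, prefix + '\uffff') starts with the prefix.
        return new ArrayList<>(rows.subMap(prefix, prefix + Character.MAX_VALUE).values());
    }
}
```

A handler serving a request for one user namespace would then scan only that namespace's prefix, for example `table.scan("ns1")`.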

Approach #1

Approach #2

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

Deprecated Programmatic APIs

New REST APIs

Path | Method | Description | Response Code | Response
/v3/apps/<app-id> | GET | Returns the application spec for a given application | 200 - On success; 404 - When application is not available; 500 - Any internal errors |







Deprecated REST API

Path | Method | Description
/v3/apps/<app-id> | GET | Returns the application spec for a given application

CLI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

UI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

Security Impact 

What's the impact on Authorization and how does the design take care of this aspect

Impact on Infrastructure Outages 

System behavior (if applicable - document impact on downstream [ YARN, HBase etc ] component failures) and how does the design take care of these aspect

Test Scenarios

Test ID | Test Description | Expected Results












Releases

Release X.Y.Z

Release X.Y.Z

Related Work

  • Work #1
  • Work #2
  • Work #3


Future work
