Checklist

  • User Stories Documented
  • User Stories Reviewed
  • Design Reviewed
  • APIs reviewed
  • Release priorities assigned
  • Test cases reviewed
  • Blog post

Introduction 

 

Goals

  • Make CDAP authorization policy consistent across all entities and permissions
  • Allow setting granular permissions at dataset level, application level etc. 
  • Ranger integration for CDAP authorization
  • Improve Sentry data model to fix existing issues seen on customer environment
  • Allow admins to use existing role/groups for authorization

 

User Stories 

  • TBD

Design

CDAP Authorization Model

  • Currently, READ on a dataset requires a permission on the namespace.
    • Disadvantages:
      • Dataset READ/WRITE requires some permission on the namespace, such as READ. Because privileges are hierarchical, this results in READ on every entity inside the namespace.
  • Having EXECUTE on a program does not allow a user to run the program unless they also have some privilege on the application.
    • To see the program in the UI, some privilege is needed on the application.
  • Is there a need for non-hierarchical privileges?
    • Managing non-hierarchical privileges can be cumbersome for admins.
  • Revoking all privileges from an entity leaves it with no privileges, which makes the entity unusable.
    • What happens if the only user who has ADMIN on the entity disappears from LDAP for some reason?
  • Updating system artifacts is not possible because only cdap has access to the system namespace.
  • Define the behavior when privileges change, for each of:
    • Existing program containers
    • New program containers
    • System container
    • Master
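
The over-granting caused by hierarchical privileges can be illustrated with a minimal Python sketch. This is not CDAP's implementation: the `Grant` type, the tuple-based entity paths, and `is_authorized` are hypothetical names, and the only assumption is that a privilege granted on a parent entity also applies to everything beneath it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One privilege: a principal may perform an action on an entity (hypothetical type)."""
    principal: str
    entity: tuple  # path from the namespace down, e.g. ("ns1",) or ("ns1", "orders")
    action: str    # READ, WRITE, EXECUTE or ADMIN


def is_authorized(grants, principal, entity, action):
    """Hierarchical check: a grant on the entity or on any ancestor is enough."""
    for depth in range(len(entity), 0, -1):
        if Grant(principal, entity[:depth], action) in grants:
            return True
    return False


# alice is given READ on the namespace only so that she can read "orders".
grants = {Grant("alice", ("ns1",), "READ")}

can_read_orders = is_authorized(grants, "alice", ("ns1", "orders"), "READ")
# The same grant also covers a dataset she was never meant to read.
can_read_payments = is_authorized(grants, "alice", ("ns1", "payments"), "READ")
```

Both checks succeed, which is the disadvantage noted above: the namespace-level READ needed for one dataset implies READ on every entity in the namespace.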

The existing CDAP Authorization Model has the following drawbacks:

  • Granular permissions
    • Cannot grant a user the privilege to read only one dataset or one stream in a namespace.
    • Cannot grant a user the privilege to deploy an application/artifact/dataset/stream without granting WRITE on the namespace.
    • Cannot grant a user the privilege to start/stop a program without granting READ on the namespace.
  • Visibility
    • A user who has a privilege on a program cannot see the program in the UI or CLI without also having some privilege on the namespace.
  • Inconsistencies 
    • To write to a dataset, a user needs WRITE on the dataset, but to write to a stream, a user needs WRITE on the stream and READ on the namespace.
    • To retrieve dataset properties, READ on the dataset is required, whereas to read stream properties, any privilege (READ/WRITE/EXECUTE/ADMIN) is sufficient.
    • ADMIN on an entity allows deleting the entity, but does not allow creating it.
    • Reading a dataset needs namespace READ, but writing to a dataset does not need namespace WRITE.
    • TBD Dataset Module Delete All.
  • Redundancy
    • List and View operations are equivalent but are listed separately in documentation.
    • Dataset READ and Stream READ are redundant because they need Namespace READ permission to be meaningful.
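
The inconsistencies above are easier to compare when written down as data. The sketch below is one reading of those bullets, not an extract from CDAP: the operation names, the `CURRENT_REQUIREMENTS` table, and both helper functions are hypothetical and exist only for illustration.

```python
# Privileges each operation needs today, per the drawbacks listed above.
# Each requirement is (entity type the privilege is checked on, action).
CURRENT_REQUIREMENTS = {
    "dataset.read": [("dataset", "READ"), ("namespace", "READ")],
    "dataset.write": [("dataset", "WRITE")],
    "stream.write": [("stream", "WRITE"), ("namespace", "READ")],
}


def needs_namespace_privilege(operation):
    """True if the operation also requires a privilege on the enclosing namespace."""
    return any(
        entity_type == "namespace"
        for entity_type, _action in CURRENT_REQUIREMENTS[operation]
    )


def namespace_dependent_operations():
    """Operations that cannot be granted without touching the namespace."""
    return sorted(op for op in CURRENT_REQUIREMENTS if needs_namespace_privilege(op))
```

Listing the namespace-dependent operations shows the asymmetry directly: reading a dataset and writing to a stream both depend on the namespace, while writing to a dataset does not.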

CDAP Ranger Integration

CDAP Sentry Extension Improvements

  • Has no grant

 

 

Existing Roles/Groups for Authorization

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

Deprecated Programmatic APIs

New REST APIs

Path | Method | Description | Response Code | Response
/v3/apps/<app-id> | GET | Returns the application spec for a given application | 200 - On success; 404 - When application is not available; 500 - Any internal errors |
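
A client for the endpoint above might look like the following minimal sketch. The base URL `http://localhost:11015` is an assumption about where the CDAP router listens, and `app_spec_url`, `describe_status`, and `fetch_app_spec` are hypothetical helper names; only the path and the three response codes come from the table.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:11015"  # assumption: adjust to your CDAP router address

# Response codes documented for GET /v3/apps/<app-id>.
STATUS_MEANINGS = {
    200: "success",
    404: "application is not available",
    500: "internal error",
}


def app_spec_url(app_id, base_url=BASE_URL):
    """Build the URL for an application's spec, escaping the app id."""
    return f"{base_url}/v3/apps/{urllib.parse.quote(app_id, safe='')}"


def describe_status(code):
    """Map a response code to its documented meaning."""
    return STATUS_MEANINGS.get(code, "unexpected status")


def fetch_app_spec(app_id, base_url=BASE_URL):
    """Return (status code, response body) for the application spec request."""
    try:
        with urllib.request.urlopen(app_spec_url(app_id, base_url)) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # 404 and 500 arrive as HTTPError; report them instead of raising.
        return err.code, err.read().decode("utf-8")
```

Calling `fetch_app_spec` requires a running CDAP instance; the two pure helpers can be used on their own to build the URL and interpret the response code.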

 

     

Deprecated REST API

Path | Method | Description
/v3/apps/<app-id> | GET | Returns the application spec for a given application

CLI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

UI Impact or Changes

  • Impact #1
  • Impact #2
  • Impact #3

Security Impact 

What is the impact on authorization, and how does the design take care of this aspect?

Impact on Infrastructure Outages 

System behavior (if applicable, document the impact of downstream component failures such as YARN or HBase) and how the design takes care of this aspect.

Test Scenarios

Test ID | Test Description | Expected Results

Releases

Release X.Y.Z

Release X.Y.Z

Related Work

  • Work #1
  • Work #2
  • Work #3

 

Future work
