Security-Impersonation-Namespace Mapping Scenarios

Overview

This page documents the security scenarios supported in CDAP 3.5. The scenarios below apply to the following combinations of security features:

  1. Authorization
  2. Authorization + Namespace Mapping
  3. Authorization + Impersonation
  4. Authorization + Impersonation + Namespace mapping

NOTE: In this document:

  • EntityA --> EntityB indicates a call (method call or RPC) from EntityA to EntityB.
  • Monospace indicates an operation (either a method call or an RPC).
  • Bold superscript indicates the RPC transport.
  • Bold blue indicates that a userId is being set or read.
  • Bold green indicates impersonation.
  • Bold red indicates an exit with failure.

NOTE: This document also assumes that the Authorizer extension is Apache Sentry, and therefore calls out Thrift as the communication mechanism.

REST APIs

Publicly routed REST APIs in AppFabric Service

Application Deployment

Applications with non-existing dataset

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: createDataset()
  6. DatasetServiceClient --> DatasetService HTTP: createDataset(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> Authorizer Thrift: revoke(ds); grant(ds, SecurityRequestContext.userId, ALL)
  9. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(ds))
  10. DatasetService --> Authorizer Thrift: !success ? revoke(ds)
  11. DatasetService --> AppFabric --> Router --> Client HTTP: result
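
Steps 2, 6 and 7 above rely on the user ID travelling with the request from one service to the next. The following is a minimal sketch of that hand-off, assuming a thread-local holder and a plain header map. The class, field and method names are hypothetical stand-ins, not CDAP's actual classes; only the header name CDAP-UserId comes from the steps above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: these names are stand-ins, not CDAP's actual classes.
class UserIdPropagationSketch {

  // Header name as written in step 6 above.
  static final String USER_ID_HEADER = "CDAP-UserId";

  // Stand-in for SecurityRequestContext: the user ID bound to the current thread.
  static final ThreadLocal<String> REQUEST_USER = new ThreadLocal<>();

  // Caller side (for example DatasetServiceClient): copy the current user ID
  // into the headers of the outgoing request.
  static Map<String, String> buildHeaders() {
    Map<String, String> headers = new HashMap<>();
    String userId = REQUEST_USER.get();
    if (userId != null) {
      headers.put(USER_ID_HEADER, userId);
    }
    return headers;
  }

  // Receiver side (for example DatasetService): restore the user ID from the
  // incoming headers so that the enforcement check can read it.
  static String readUserId(Map<String, String> headers) {
    String userId = headers.get(USER_ID_HEADER);
    REQUEST_USER.set(userId);
    return userId;
  }
}
```

In this sketch the receiving service sees the same user ID that the calling service had in its request context, which is what lets step 7 enforce authorization for the original caller.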

Applications with existing dataset

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: !compatibleUpdate ? IncompatibleException
  6. DatasetServiceClient --> DatasetService HTTP: update(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> DatasetService: success = update(ds)
  9. DatasetService --> AppFabric --> Router --> Client HTTP: result

Applications with non-existing streams

Applications with existing streams

Namespace Creation

  1. Client --> Router HTTP: createNamespace(nsName, nsConfig)
  2. Router --> AppFabric HTTP: createNamespace(nsName, nsConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> Authorizer Thrift: grant(namespace, SecurityRequestContext.userId, ALL)
  5. AppFabric --> DatasetServiceClient: getDataset(app.meta)
  6. DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
  7. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId) (Note: this is always non-empty, because of the system principal.)
  8. DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
  9. AppFabric --> MDS: store(namespace)
  10. AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, createNamespace(namespaceMeta)) (Note: for custom mappings, this only checks for access; otherwise, it performs the create.)
  11. AppFabric --> AppFabric: !result ? revoke(namespace) && NamespaceCannotBeCreatedException
  12. AppFabric --> Router --> Client HTTP: result
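
Steps 4, 9, 10 and 11 above form a grant, create, and roll-back sequence. The following is a minimal sketch of that ordering, assuming in-memory sets in place of the Authorizer and the MDS. All names are hypothetical, and the exception is unchecked here only to keep the sketch short. The enforcement check in step 3 and the dataset lookup in steps 5 to 8 are omitted.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Illustrative sketch only: in-memory stand-ins, not CDAP's actual classes.
class NamespaceCreateSketch {

  // Unchecked here only to keep the sketch short.
  static class NamespaceCannotBeCreatedException extends RuntimeException {
    NamespaceCannotBeCreatedException(String nsName) {
      super("Namespace cannot be created: " + nsName);
    }
  }

  // Stand-in for the Authorizer: entries of the form "userId:namespace".
  final Set<String> grants = new HashSet<>();

  // Stand-in for the MDS: names of stored namespaces.
  final Set<String> metadataStore = new HashSet<>();

  // storageCreate stands in for doAs(nsName, createNamespace(namespaceMeta)).
  void createNamespace(String nsName, String userId, BooleanSupplier storageCreate) {
    grants.add(userId + ":" + nsName);              // step 4: grant before creating
    metadataStore.add(nsName);                      // step 9: store the namespace
    boolean result = storageCreate.getAsBoolean();  // step 10: create in storage
    if (!result) {
      // Step 11: revoke the privilege granted in step 4, then fail.
      // The steps above do not mention removing the MDS entry, so the
      // sketch leaves it in place.
      grants.remove(userId + ":" + nsName);
      throw new NamespaceCannotBeCreatedException(nsName);
    }
  }
}
```

Granting before the storage call means the creator never ends up with a namespace it cannot use. The cost is the explicit revoke on the failure path.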

Namespace Deletion

  1. Client --> Router HTTP: deleteNamespace(nsName)
  2. Router --> AppFabric HTTP: deleteNamespace(nsName, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
  5. AppFabric --> DatasetServiceClient: getDataset(app.meta)
  6. DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
  7. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId) (Note: this is always non-empty, because of the system principal.)
  8. DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
  9. AppFabric --> MDS: delete(namespace)
  10. AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, delete(namespaceMeta)) (Note: for custom mappings, this only checks for access; otherwise, it performs the delete.)
  11. AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
  12. AppFabric --> AppFabric: !result ? NamespaceCannotBeDeletedException
  13. AppFabric --> Router --> Client HTTP: result

Publicly routed REST APIs in Dataset Service

Create

  1. Client --> Router HTTP: createDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: createDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> Authorizer Thrift: revoke(dataset); grant(dataset, SecurityRequestContext.userId, ALL)
  5. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(dataset))
  6. DatasetService --> Authorizer Thrift: !success ? revoke(dataset)
  7. DatasetService --> Router --> Client HTTP: result
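
Steps 4 to 6 above first clear any privileges left on the dataset name, then grant ALL to the creator, and finally undo the grant if the create fails. The following is a minimal sketch of that sequence, assuming an in-memory privilege map in place of the Authorizer. All names are hypothetical, and the action is modelled as a plain string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Illustrative sketch only: in-memory stand-ins, not CDAP's actual classes.
class DatasetCreateSketch {

  // Stand-in for the Authorizer: dataset -> (userId -> action).
  final Map<String, Map<String, String>> privileges = new HashMap<>();

  // revoke(dataset): drop every privilege recorded for the dataset.
  void revoke(String dataset) {
    privileges.remove(dataset);
  }

  // grant(dataset, userId, action): record one privilege for one user.
  void grant(String dataset, String userId, String action) {
    privileges.computeIfAbsent(dataset, k -> new HashMap<>()).put(userId, action);
  }

  // createOp stands in for doAs(namespace, createDataset(dataset)).
  boolean createDataset(String dataset, String userId, BooleanSupplier createOp) {
    revoke(dataset);                // step 4: clear stale privileges first
    grant(dataset, userId, "ALL");  // step 4: the creator gets ALL
    boolean success = createOp.getAsBoolean();  // step 5
    if (!success) {
      revoke(dataset);              // step 6: roll the grant back on failure
    }
    return success;
  }
}
```

The initial revoke matters when a dataset name is reused: without it, a user who held privileges on an earlier dataset of the same name would keep them on the new one.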

List

  1. Client --> Router HTTP: listDatasets(namespace)
  2. Router --> DatasetService HTTP: listDatasets(namespace, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: result = filter(datasetsInNamespace, SecurityRequestContext.userId)
  4. DatasetService --> Router --> Client HTTP: result

Get

  1. Client --> Router HTTP: getDataset(dataset)
  2. Router --> DatasetService HTTP: dataset = getDataset(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)
  4. DatasetService --> Router --> Client HTTP: result.isEmpty ? UnauthorizedException
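
The List and Get flows above both go through filter rather than a yes/no authorization check. The following is a minimal sketch of the difference between them, assuming a per-user set of visible datasets in place of the AuthEnforcer. All names are hypothetical, and the exception is unchecked here only to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: in-memory stand-ins, not CDAP's actual classes.
class DatasetFilterSketch {

  // Unchecked here only to keep the sketch short.
  static class UnauthorizedException extends RuntimeException {
    UnauthorizedException(String message) {
      super(message);
    }
  }

  // Stand-in for the AuthEnforcer's view: userId -> datasets the user may see.
  final Map<String, Set<String>> visible = new HashMap<>();

  // filter(...): keep only the datasets the user has some privilege on.
  List<String> filter(List<String> datasets, String userId) {
    Set<String> allowed = visible.getOrDefault(userId, Collections.<String>emptySet());
    List<String> result = new ArrayList<>();
    for (String dataset : datasets) {
      if (allowed.contains(dataset)) {
        result.add(dataset);
      }
    }
    return result;
  }

  // List: return whatever survives the filter, possibly nothing.
  List<String> listDatasets(List<String> datasetsInNamespace, String userId) {
    return filter(datasetsInNamespace, userId);
  }

  // Get: an empty filter result is reported as an authorization failure.
  String getDataset(String dataset, String userId) {
    List<String> result = filter(Collections.singletonList(dataset), userId);
    if (result.isEmpty()) {
      throw new UnauthorizedException(userId + " cannot access " + dataset);
    }
    return result.get(0);
  }
}
```

In this sketch, List never fails for lack of privileges and simply returns a shorter list, while Get turns an empty result into UnauthorizedException, matching step 4 of the Get flow.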

Update

  1. Client --> Router HTTP: updateDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: updateDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetService: result = update(dataset, type, properties)
  5. DatasetService --> Router --> Client HTTP: result

Truncate

  1. Client --> Router HTTP: truncate(dataset)
  2. Router --> DatasetService HTTP: truncate(ds, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, truncate(dataset))
  5. DatasetService --> Router --> Client HTTP: result
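
Several flows above, including Truncate, wrap the storage operation in doAs(namespace, ...). This page does not define doAs, so the following sketch rests on an assumption: when impersonation is configured for a namespace, the operation runs as that namespace's principal, and otherwise it runs as a default system user. The class, the "cdap" default user and the principal map are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch only: the semantics of doAs here are an assumption.
class DoAsSketch {

  // Hypothetical default user for namespaces without an impersonated principal.
  static final String DEFAULT_USER = "cdap";

  // The user that operations on the current thread run as.
  static final ThreadLocal<String> EFFECTIVE_USER =
      ThreadLocal.withInitial(() -> DEFAULT_USER);

  // Hypothetical configuration: namespace -> principal to impersonate.
  final Map<String, String> namespacePrincipals = new HashMap<>();

  // Run op as the namespace's principal, then restore the previous user.
  <T> T doAs(String namespace, Supplier<T> op) {
    String previous = EFFECTIVE_USER.get();
    EFFECTIVE_USER.set(namespacePrincipals.getOrDefault(namespace, DEFAULT_USER));
    try {
      return op.get();
    } finally {
      // Always restore, even if op throws, so the identity does not leak
      // into later work on the same thread.
      EFFECTIVE_USER.set(previous);
    }
  }
}
```

Note that the enforcement check in each flow happens before doAs and uses the requesting user's ID, while the storage operation itself runs under the namespace's identity.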

Drop

  1. Client --> Router HTTP: drop(dataset)
  2. Router --> DatasetService HTTP: drop(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, drop(dataset))
  5. DatasetService --> Authorizer Thrift: revoke(dataset)
  6. DatasetService --> Router --> Client HTTP: result

Upgrade

  1. Client --> Router HTTP: upgrade(dataset)
  2. Router --> DatasetService HTTP: upgrade(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, upgrade(dataset))
  5. DatasetService --> Router --> Client HTTP: result

Publicly routed REST APIs in Stream Service

Program Runtime

Access datasets, streams and secure keys

At program runtime, users can access datasets, streams, and secure keys through the program APIs (MapReduce/Spark/Flows) or through the Dataset APIs (getDataset).

Administer datasets, streams and secure keys

At program runtime, users can administer datasets, streams, and secure keys via the Admin APIs.

Update system metadata

At program runtime, CDAP performs various system operations for:

  • Recording Audit
  • Recording Lineage
  • Recording Usage
  • Recording Run Records
  • Namespace Lookup
  • Authorization Enforcement

Explore

Access datasets and streams

Users can execute Hive SELECT (for BatchReadable datasets) and INSERT (for BatchWritable datasets) queries via Explore to access data in datasets and streams.
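
The rule above can be sketched as a mapping from a dataset's capabilities to the Hive operations it supports. This is a minimal sketch under stated assumptions: the marker interfaces below are simplified stand-ins for BatchReadable and BatchWritable, not the real CDAP interfaces, and the class, enum and method names are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative sketch only: simplified stand-ins, not CDAP's actual classes.
class ExploreOpsSketch {

  enum HiveOp { SELECT, INSERT }

  // Simplified marker stand-ins for the dataset capability interfaces.
  interface BatchReadable {}
  interface BatchWritable {}

  // Example dataset types used to exercise the mapping.
  static class ReadOnlyDataset implements BatchReadable {}
  static class ReadWriteDataset implements BatchReadable, BatchWritable {}

  // SELECT requires a readable dataset; INSERT requires a writable one.
  static Set<HiveOp> allowedOperations(Object dataset) {
    Set<HiveOp> ops = EnumSet.noneOf(HiveOp.class);
    if (dataset instanceof BatchReadable) {
      ops.add(HiveOp.SELECT);
    }
    if (dataset instanceof BatchWritable) {
      ops.add(HiveOp.INSERT);
    }
    return ops;
  }
}
```

A dataset that implements neither capability yields an empty set in this sketch, meaning neither query type applies to it.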

Administer datasets and streams

Create operations on datasets and streams can create tables in Hive if Explore is enabled. Similarly, delete operations can drop and truncate those tables.

Authorization Cache Updates

Scratch Pad

a) Authorization

b) Auth + NS

c) Auth + Impersonation

d) Auth + Impersonation + NS


1. Application deploy -> Create DS and Streams

2. Program Run -> Creating DS and Streams

3. Program Run -> Access DS and Streams

4. Explore -> Access Dataset (Explore can insert to DS)  INSERT on SELECT

5. REST APIs -> Create DS and Streams

6. REST APIs -> Access DS and Streams

7. Program -> Access System DS for System metadata recording

Replace Create with Create, Delete and Truncate. All of the admin ops should be accounted for.

8. Create namespace

9. Delete namespace