...

NOTE: This document also assumes that the Authorizer extension is Apache Sentry, and therefore calls out Thrift as the communication mechanism.

Program Runtime

Access datasets, streams and secure keys

During program runtime, users can access datasets, streams and secure keys through the program APIs (MapReduce/Spark/Flows) or through the Dataset APIs (getDataset).
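
For example, a MapReduce program can obtain dataset instances from its runtime context, either statically via @UseDataSet or dynamically via getDataset. A minimal sketch, assuming hypothetical dataset names and the CDAP 3.x/4.x API packages (details vary by version):

    import co.cask.cdap.api.annotation.UseDataSet;
    import co.cask.cdap.api.dataset.lib.KeyValueTable;
    import co.cask.cdap.api.mapreduce.AbstractMapReduce;

    public class PurchaseCounterMapReduce extends AbstractMapReduce {
      // Static injection: 'purchases' is a hypothetical dataset assumed to exist in the namespace.
      @UseDataSet("purchases")
      private KeyValueTable purchases;

      @Override
      public void initialize() throws Exception {
        // Dynamic lookup through the Dataset API; both paths are subject to authorization enforcement.
        KeyValueTable counts = getContext().getDataset("purchaseCounts");
        counts.write("lastRun", String.valueOf(System.currentTimeMillis()));
      }
    }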

Administer datasets, streams and secure keys

During program runtime, users can administer datasets, streams and secure keys via the Admin APIs.
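
As an illustration, the Admin interface available from the program context can create, check and drop datasets at runtime. A minimal sketch, assuming CDAP's Admin API (dataset name and type are hypothetical; exact signatures may differ by version):

    import co.cask.cdap.api.Admin;
    import co.cask.cdap.api.dataset.DatasetProperties;

    public final class AdminExample {
      // 'admin' would typically come from the program context, e.g. getContext().getAdmin().
      static void ensureEventsDataset(Admin admin) throws Exception {
        if (!admin.datasetExists("events")) {
          // Creating the dataset here is itself checked by authorization enforcement.
          admin.createDataset("events", "table", DatasetProperties.EMPTY);
        }
      }
    }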

Update system metadata

During program runtime, CDAP performs various system operations:

  • Recording Audit
  • Recording Lineage
  • Recording Usage
  • Recording Run Records
  • Namespace Lookup
  • Authorization Enforcement (see the sketch after this list)
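
Authorization enforcement in all of these paths reduces to a single check against the configured Authorizer. A minimal sketch, assuming an enforcer interface along the lines of CDAP's AuthorizationEnforcer (package and class names may differ by version):

    import co.cask.cdap.proto.id.DatasetId;
    import co.cask.cdap.proto.security.Action;
    import co.cask.cdap.proto.security.Principal;
    import co.cask.cdap.security.spi.authorization.AuthorizationEnforcer;

    public final class EnforcementExample {
      // Throws UnauthorizedException if the principal lacks the required action on the entity.
      static void checkAdmin(AuthorizationEnforcer enforcer, String userId) throws Exception {
        Principal principal = new Principal(userId, Principal.PrincipalType.USER);
        DatasetId dataset = new DatasetId("default", "purchases"); // hypothetical dataset
        enforcer.enforce(dataset, principal, Action.ADMIN);
      }
    }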

Explore

Access datasets and streams

Users can execute Hive SELECT queries (for BatchReadable datasets) and INSERT queries (for BatchWritable datasets) via Explore to access data in datasets and streams.
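
Such queries can be submitted through the CDAP Explore JDBC driver. A hedged sketch, assuming the documented driver class and jdbc:cdap URL scheme, with placeholder host/port and a hypothetical dataset table (CDAP prefixes dataset tables with dataset_):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public final class ExploreQueryExample {
      public static void main(String[] args) throws Exception {
        // Driver class and URL per the CDAP Explore JDBC documentation; adjust host and port,
        // and append an auth token parameter on a secure cluster.
        Class.forName("co.cask.cdap.explore.jdbc.ExploreDriver");
        try (Connection conn = DriverManager.getConnection("jdbc:cdap://cdap-host:11015");
             Statement stmt = conn.createStatement();
             // SELECT is allowed for BatchReadable datasets.
             ResultSet rs = stmt.executeQuery("SELECT * FROM dataset_purchases LIMIT 10")) {
          while (rs.next()) {
            System.out.println(rs.getString(1));
          }
        }
        // INSERT is allowed for BatchWritable datasets, e.g.:
        //   INSERT INTO TABLE dataset_history SELECT * FROM dataset_purchases
      }
    }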

Administer datasets and streams

Create operations on datasets and streams can create tables in Hive if Explore is enabled. Similarly, delete operations can drop the corresponding Hive tables, and truncate operations can truncate them.

REST APIs

 


Publicly routed REST APIs in AppFabric Service

 

Application Deployment

 

Applications with non-existing dataset

 

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: createDataset()
  6. DatasetServiceClient --> DatasetService HTTP: createDataset(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> Authorizer Thrift: revoke(ds); grant(ds, SecurityRequestContext.userId, ALL)
  9. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(ds))
  10. DatasetService --> Authorizer Thrift: !success ? revoke(ds)
  11. DatasetService --> AppFabric --> Router --> Client HTTP: result
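
Steps 8-10 grant the deploying user full privileges on the new dataset before the physical creation, and roll the grant back if creation fails. A minimal sketch of that pattern with hypothetical stand-in interfaces (the real Authorizer and impersonation classes live elsewhere in CDAP and have richer signatures):

    import java.util.Collections;
    import java.util.Set;
    import java.util.concurrent.Callable;

    // Hypothetical stand-ins for the Authorizer extension and the impersonation helper.
    interface Authorizer {
      void grant(String entity, String user, Set<String> actions);
      void revoke(String entity);
    }

    interface Impersonator {
      <T> T doAs(String namespace, Callable<T> callable) throws Exception;
    }

    final class CreateDatasetFlow {
      static boolean create(Authorizer authorizer, Impersonator impersonator,
                            String namespace, String dataset, String userId,
                            Callable<Boolean> physicalCreate) throws Exception {
        String ds = namespace + ":" + dataset;
        authorizer.revoke(ds);                                      // step 8: clear any stale privileges
        authorizer.grant(ds, userId, Collections.singleton("ALL")); // step 8: grant ALL to the creator
        boolean success = impersonator.doAs(namespace, physicalCreate); // step 9: create as the namespace user
        if (!success) {
          authorizer.revoke(ds);                                    // step 10: roll back the grant on failure
        }
        return success;
      }
    }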

 

Applications with existing dataset

 

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: !compatibleUpdate ? IncompatibleException
  6. DatasetServiceClient --> DatasetService HTTP: update(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> DatasetService: success = update(ds)
  9. DatasetService --> AppFabric --> Router --> Client HTTP: result

 

Applications with non-existing streams

 

Applications with existing streams

 

Namespace Creation

 

 

Namespace Deletion

 

Publicly routed REST APIs in Dataset Service

 

Create

 

  1. Client --> Router HTTP: createDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: createDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> Authorizer Thrift: revoke(dataset); grant(dataset, SecurityRequestContext.userId, ALL)
  5. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(dataset))
  6. DatasetService --> Authorizer Thrift: !success ? revoke(dataset)
  7. DatasetService --> Router --> Client HTTP: result
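
From the client's perspective, step 1 of this flow is a single REST call against the Router. A sketch in plain Java, assuming the standard CDAP v3 dataset endpoint (host, port, namespace and dataset name are placeholders; on a secure cluster an access token would be sent as an Authorization header, from which the Router derives SecurityRequestContext.userId):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public final class CreateDatasetClient {
      public static void main(String[] args) throws Exception {
        URL url = new URL("http://cdap-router:11015/v3/namespaces/default/data/datasets/purchases");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("PUT");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // Body per the Dataset RESTful API: the dataset type name plus optional properties.
        byte[] body = "{\"typeName\":\"table\",\"properties\":{}}".getBytes(StandardCharsets.UTF_8);
        try (OutputStream os = conn.getOutputStream()) {
          os.write(body);
        }
        System.out.println("HTTP " + conn.getResponseCode());
        conn.disconnect();
      }
    }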

 

List

 

  1. Client --> Router HTTP: listDatasets(namespace)
  2. Router --> DatasetService HTTP: listDatasets(namespace, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: result = filter(datasetsInNamespace, SecurityRequestContext.userId)
  4. DatasetService --> Router --> Client HTTP: result
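
Unlike the other flows, List does not throw UnauthorizedException; the result set is simply filtered down to the datasets the caller is allowed to see. A hypothetical sketch of that filtering step (the visibility predicate stands in for the check delegated to the Authorizer extension):

    import java.util.List;
    import java.util.function.BiPredicate;
    import java.util.stream.Collectors;

    final class ListDatasetsFlow {
      static List<String> filter(List<String> datasetsInNamespace, String userId,
                                 BiPredicate<String, String> isVisible) {
        return datasetsInNamespace.stream()
            .filter(ds -> isVisible.test(ds, userId))   // keep only datasets visible to the user
            .collect(Collectors.toList());
      }
    }

The Get flow below applies the same filter to a single dataset and translates an empty result into UnauthorizedException.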

 

Get

 

  1. Client --> Router HTTP: getDataset(dataset)
  2. Router --> DatasetService HTTP: dataset = getDataset(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)
  4. DatasetService --> Router --> Client HTTP: result.isEmpty ? UnauthorizedException

 

Update

 

  1. Client --> Router HTTP: updateDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: updateDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetService: result = update(dataset, type, properties)
  5. DatasetService --> Router --> Client HTTP: result

 

Truncate

 

  1. Client --> Router HTTP: truncate(dataset)
  2. Router --> DatasetService HTTP: truncate(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, truncate(dataset))
  5. DatasetService --> Router --> Client HTTP: result

 

Drop

 

  1. Client --> Router HTTP: drop(dataset)
  2. Router --> DatasetService HTTP: drop(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, drop(dataset))
  5. DatasetService --> Authorizer Thrift: revoke(dataset)
  6. DatasetService --> Router --> Client HTTP: result
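
Drop is the mirror image of Create: the physical drop runs first (step 4) and the dataset's privileges are revoked afterwards (step 5), so no privileges are left behind for a dataset that no longer exists. Reusing the hypothetical Authorizer and Impersonator stand-ins from the deployment sketch above:

    final class DropDatasetFlow {
      static boolean drop(Authorizer authorizer, Impersonator impersonator,
                          String namespace, String dataset,
                          java.util.concurrent.Callable<Boolean> physicalDrop) throws Exception {
        boolean success = impersonator.doAs(namespace, physicalDrop); // step 4: drop as the namespace user
        authorizer.revoke(namespace + ":" + dataset);                 // step 5: revoke all privileges
        return success;
      }
    }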

 

Upgrade

 

  1. Client --> Router HTTP: upgrade(dataset)
  2. Router --> DatasetService HTTP: upgrade(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, upgrade(dataset))
  5. DatasetService --> Router --> Client HTTP: result

 

Publicly routed REST APIs in Stream Service



Scratch Pad

a) Authorization

...