Overview

This page documents various scenarios for security use cases supported in 3.5. The scenarios below apply to the following combinations of security:

...

NOTE: This document assumes that the Authorizer extension is Apache Sentry, and therefore refers to Thrift as the communication mechanism.

Program Runtime

Access datasets, streams and secure keys

During program runtimes, users can access datasets, streams and secure keys through program APIs (MapReduce/Spark/Flows) or through Dataset APIs (getDataset).

Administer datasets, streams and secure keys

During program runtimes, users can administer datasets, streams and secure keys via the Admin APIs.

Update system metadata

During program runtimes, CDAP performs various system operations for:

  • Recording Audit
  • Recording Lineage
  • Recording Usage
  • Recording Run Records
  • Namespace Lookup
  • Authorization Enforcement
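The last item, authorization enforcement, recurs throughout the flows below as `!authorized(...) ? UnauthorizedException`. The shape of that check can be sketched with a toy in-memory privilege store; all names here are illustrative stand-ins, not the actual CDAP or Sentry APIs:

```python
class UnauthorizedException(Exception):
    """Raised when a principal lacks the required privilege on an entity."""

# Privilege store: entity -> {principal: set of granted actions}.
_privileges = {}

def grant(entity, principal, action):
    _privileges.setdefault(entity, {}).setdefault(principal, set()).add(action)

def revoke(entity):
    """Drop all privileges on an entity, as in the revoke(ds) steps below."""
    _privileges.pop(entity, None)

def enforce(principal, entity, action):
    """Mirror of the `!authorized(...) ? UnauthorizedException` steps."""
    granted = _privileges.get(entity, {}).get(principal, set())
    if action not in granted and "ALL" not in granted:
        raise UnauthorizedException(f"{principal} lacks {action} on {entity}")

grant("ns1.ds1", "alice", "READ")
enforce("alice", "ns1.ds1", "READ")   # passes silently
```

Note that a grant of ALL short-circuits the per-action check, matching the `grant(..., ALL)` steps in the flows below.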

Explore

Access datasets and streams

Users can execute Hive SELECT queries (for BatchReadable datasets) and INSERT queries (for BatchWritable datasets) via Explore to access data in datasets and streams.

Administer datasets and streams

Create operations on datasets and streams can create tables in Hive if Explore is enabled. Similarly, delete operations can drop tables, and truncate operations can truncate them.
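The mapping from dataset admin operations to Hive DDL can be sketched as below. The table-naming convention and the elided storage-handler clause are illustrative only, not the actual DDL CDAP Explore issues:

```python
def ddl_for(op, dataset, explore_enabled=True):
    """Map a dataset admin operation to the Hive DDL Explore would issue.
    Illustrative sketch: real table names and clauses differ."""
    if not explore_enabled:
        return None          # no Hive side effects when Explore is off
    table = f"dataset_{dataset}"
    if op == "create":
        # Storage-handler and schema clauses elided for brevity.
        return f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} ..."
    if op == "delete":
        return f"DROP TABLE IF EXISTS {table}"
    if op == "truncate":
        return f"TRUNCATE TABLE {table}"
    raise ValueError(f"unsupported op: {op}")
```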

 

REST APIs

Publicly routed REST APIs in AppFabric Service

...

Applications with non-existing datasets

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: createDataset()
  6. DatasetServiceClient --> DatasetService HTTP: createDataset(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> Authorizer Thrift: revoke(ds); grant(ds, SecurityRequestContext.userId, ALL)
  9. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(ds))
  10. DatasetService --> Authorizer Thrift: !success ? revoke(ds)
  11. DatasetService --> AppFabric --> Router --> Client HTTP: result
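Steps 8-10 above form a grant-then-create pattern: the creator is granted ALL before the storage operation, and the grant is rolled back if creation fails. A minimal sketch of that pattern, with hypothetical collaborators rather than the real CDAP classes:

```python
class DatasetService:
    """Sketch of steps 8-10 of the deploy flow: grant ALL to the creator
    before creating storage, and revoke again if creation fails.
    Collaborator names are illustrative, not the real CDAP classes."""

    def __init__(self, authorizer, op_executor):
        self.authorizer = authorizer    # grant(ds, user, action) / revoke(ds)
        self.op_executor = op_executor  # create(ds) -> bool

    def create_dataset(self, ds, user):
        self.authorizer.revoke(ds)                # step 8: drop stale privileges
        self.authorizer.grant(ds, user, "ALL")    # step 8: creator gets ALL
        success = self.op_executor.create(ds)     # step 9 (doAs elided)
        if not success:
            self.authorizer.revoke(ds)            # step 10: roll back the grant
        return success
```

The up-front revoke in step 8 matters because a dataset of the same name may have existed before and left privileges behind.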

...

  1. Client --> Router HTTP: deployApp(artifact, appConfig)
  2. Router --> AppFabric HTTP: deployApp(artifact, appConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> AppFabric: doAs(namespace, deploy(jar, config))
  5. AppFabric --> DatasetServiceClient: !compatibleUpdate ? IncompatibleException
  6. DatasetServiceClient --> DatasetService HTTP: update(ds, Header(CDAP-UserId=SecurityRequestContext.userId))
  7. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  8. DatasetService --> DatasetService: success = update(ds)
  9. DatasetService --> AppFabric --> Router --> Client HTTP: result

...

Applications with existing streams

Namespace Creation

 

  1. Client --> Router HTTP: createNamespace(nsName, nsConfig)
  2. Router --> AppFabric HTTP: createNamespace(nsName, nsConfig, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> Authorizer Thrift: grant(namespace, SecurityRequestContext.userId, ALL)
  5. AppFabric --> DatasetServiceClient: getDataset(app.meta)
  6. DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
  7. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)        (info) This will always be non-empty, because of the system principal
  8. DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
  9. AppFabric --> MDS: store(namespace)
  10. AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, createNamespace(namespaceMeta))     (info) This will only check for access for custom mappings, but will create otherwise
  11. AppFabric --> AppFabric: !result ? revoke(namespace) && NamespaceCannotBeCreatedException
  12. AppFabric --> Router --> Client HTTP: result
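Steps 4 and 9-11 above follow the same grant-first, revoke-on-failure shape as dataset creation: the caller is granted ALL on the namespace, metadata is stored, and if the storage provider cannot create the namespace the grant is undone and an exception is raised. A sketch with hypothetical collaborator objects standing in for the CDAP services:

```python
class NamespaceCannotBeCreatedException(Exception):
    pass

def create_namespace(ns, user, authorizer, mds, storage_admin):
    """Sketch of steps 4 and 9-11: grant first, persist metadata, create the
    storage namespace, and revoke the grant if storage creation fails.
    The collaborators are hypothetical stand-ins, not real CDAP classes."""
    authorizer.grant(ns, user, "ALL")      # step 4: caller gets ALL on the namespace
    mds.store(ns)                          # step 9: record namespace metadata
    if not storage_admin.create(ns):       # step 10 (doAs for custom mappings elided)
        authorizer.revoke(ns)              # step 11: undo the grant
        raise NamespaceCannotBeCreatedException(ns)
    return ns
```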

Namespace Deletion

  1. Client --> Router HTTP: deleteNamespace(nsName)
  2. Router --> AppFabric HTTP: deleteNamespace(nsName, SecurityRequestContext.userId)
  3. AppFabric --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
  5. AppFabric --> DatasetServiceClient: getDataset(app.meta)
  6. DatasetServiceClient --> DatasetService HTTP: getDataset(app.meta, Header(CDAP-UserId=Principal.SYSTEM))
  7. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)        (info) This will always be non-empty, because of the system principal
  8. DatasetService --> DatasetServiceClient HTTP --> AppFabric: MDS
  9. AppFabric --> MDS: delete(namespace)
  10. AppFabric --> StorageProviderNsAdmin: result = doAs(nsName, delete(namespaceMeta))              (info) This will only check for access for custom mappings, but will delete otherwise
  11. AppFabric --> Authorizer Thrift: revoke(namespace, SecurityRequestContext.userId, ALL)
  12. AppFabric --> AppFabric: !result ? NamespaceCannotBeDeletedException
  13. AppFabric --> Router --> Client HTTP: result

Publicly routed REST APIs in Dataset Service

...

  1. Client --> Router HTTP: createDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: createDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> Authorizer Thrift: revoke(dataset); grant(dataset, SecurityRequestContext.userId, ALL)
  5. DatasetService --> DatasetOpExecutor HTTP: success = doAs(namespace, createDataset(dataset))
  6. DatasetService --> Authorizer Thrift: !success ? revoke(dataset)
  7. DatasetService --> Router --> Client HTTP: result

...

  1. Client --> Router HTTP: getDataset(dataset)
  2. Router --> DatasetService HTTP: dataset = getDataset(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: result = filter(dataset, SecurityRequestContext.userId)
  4. DatasetService --> Router --> Client HTTP: result.isEmpty ? UnauthorizedException
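Unlike the write paths, which enforce up front and throw, this read path filters the result set by visibility and maps an empty result to UnauthorizedException. A minimal sketch of steps 3-4, where `visible(user, ds)` is a hypothetical stand-in for the authorizer's filter:

```python
class UnauthorizedException(Exception):
    pass

def get_dataset(ds, user, visible):
    """Sketch of steps 3-4 of the read flow: filter rather than enforce,
    and treat an empty filtered result as unauthorized."""
    result = [d for d in [ds] if visible(user, d)]   # step 3: filter
    if not result:                                   # step 4: empty -> deny
        raise UnauthorizedException(f"{user} cannot see {ds}")
    return result[0]
```

Filtering rather than enforcing is what makes list-style endpoints degrade gracefully: callers see only the entities they are allowed to see, and a hard failure occurs only when nothing at all is visible.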

Update

  1. Client --> Router HTTP: updateDataset(dataset, type, properties)
  2. Router --> DatasetService HTTP: updateDataset(dataset, type, properties, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetService: result = update(dataset, type, properties)
  5. DatasetService --> Router --> Client HTTP: result

Truncate

...

  1. Client --> Router HTTP: truncate(dataset)
  2. Router --> DatasetService HTTP: truncate(dataset, SecurityRequestContext.userId)
  3. DatasetService --> AuthEnforcer: !authorized(SecurityRequestContext.userId) ? UnauthorizedException
  4. DatasetService --> DatasetOpExecutor HTTP: result = doAs(namespace, truncate(dataset))
  5. DatasetService --> Router --> Client HTTP: result

...

Publicly routed REST APIs in Stream Service

Authorization Cache Updates

Scratch Pad

a) Authorization

...