Goals
- Key Management
- Secure impersonation
- Authorization of dataset and stream access
- Authorization for listing and viewing entities
- Ability to map a namespace to user-provided storage provider namespaces
- Cross-namespace dataset access
- Support long-running programs in secure (Kerberos) mode
Checklist
- User stories documented (Rohit/Ali/Bhooshan)
- User stories reviewed (Nitin)
- Design documented (Rohit/Ali/Bhooshan)
- Design reviewed (Andreas)
- Feature merged (Rohit/Ali/Bhooshan)
- Examples and guides (Rohit)
- Integration tests (Ali)
- Documentation for feature (Bhooshan)
- Blog post
User Stories
- As a CDAP security admin, I want CDAP programs to run as the user who launches them, and not as the headless "cdap" user. (User Impersonation)
- As a CDAP user, I would like to specify a user for a namespace so that all programs running in that namespace run as the specified user. (User Impersonation)
- As a CDAP/Hydrator security admin, I want all sensitive information, such as passwords, not to be stored in plaintext. (Key Management)
- As a CDAP security admin, I want all operations on datasets/streams to be governed by my configured authorization system. (Authorization)
- As a CDAP security admin, I want list operations for all CDAP entities to only return entities that the logged-in user is authorized to view. (Authorization)
- As a CDAP security admin, I want view operations for a CDAP entity to only succeed if the logged-in user is authorized to view that entity. (Authorization)
- As a CDAP user, I would like to specify the namespace in an underlying storage provider (e.g. HBase namespace, Hive database) to use for a particular CDAP namespace. (Namespaces)
- As a CDAP admin, I want to allow users to access a dataset from a program in a different namespace, as long as that user is authorized to access the dataset. (Namespaces)
- As a CDAP user, I want to be able to run long-running MapReduce, Spark or Hive programs on a secure (Kerberos-enabled) cluster. (Namespaces)