...
- User stories documented (Ali)
- User stories reviewed (Nitin)
- Design documented (Ali)
- Design reviewed (Andreas/Terence)
- Feature merged (Ali)
- Blog post
...
Hadoop's UserGroupInformation class has the following method:
// Log a user in from a keytab file and return the resulting UGI.
public static UserGroupInformation loginUserFromKeytabAndReturnUGI(String user, String path) throws IOException;
...
The design of the implementation needed for this has not been fleshed out yet and will come later.
Brief summary of overall changes
- During program runtime, CDAP Master will impersonate a user and launch the YARN application as that user, so that CDAP programs can run as different users.
- Because these users will not have access to system tables, programs running as them will go through CDAP system services to write to system tables (run records, lineage, usage, workflow token).
- During namespace operations (create/delete), dataset service will perform the namespace create and delete operations (HBase namespace, HDFS directories, explore database), while impersonating the configured user.
- During dataset admin operations (create/delete/truncate), dataset op executor service will perform the operations while impersonating the configured user.
- (to be finalized) Stream admin operations and stream writes will have to happen while impersonating the configured user.
- (to be finalized) Explore queries will have to run while impersonating the configured user.
- (to be finalized) Artifact deployment will also need to impersonate the user when deploying an artifact in user scope.
Note: any time a system service needs to impersonate a user, it will look up the configured principal and keytab, localize the keytab from the distributed file system, and create a UGI from that keytab. A caching mechanism for these UGIs would be useful.
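The caching idea in the note above could look roughly like the sketch below. It is not part of the design: the class name UgiCache, the time-to-live parameter, and the injected clock are all hypothetical. To keep the example runnable without Hadoop on the classpath, the type parameter T stands in for UserGroupInformation, and the login function stands in for the "localize keytab, then call loginUserFromKeytabAndReturnUGI" step. Expiring entries is an assumption as well; it is motivated by the fact that Kerberos credentials have a limited lifetime.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical sketch of a per-principal login cache.
 * T stands in for Hadoop's UserGroupInformation so the example runs without Hadoop.
 */
final class UgiCache<T> {

  /** A cached login together with the time at which it should be discarded. */
  private static final class Entry<T> {
    final T ugi;
    final long expiresAtMillis;

    Entry(T ugi, long expiresAtMillis) {
      this.ugi = ugi;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<String, Entry<T>> entries = new ConcurrentHashMap<>();
  private final Function<String, T> login; // assumed: localizes the keytab and logs in
  private final long ttlMillis;            // assumed expiry policy, not from the design
  private final LongSupplier clock;        // injected so that expiry is testable

  UgiCache(Function<String, T> login, long ttlMillis, LongSupplier clock) {
    this.login = login;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  /** Returns the cached login for the principal, logging in again if it is absent or expired. */
  T get(String principal) {
    long now = clock.getAsLong();
    // compute() runs atomically per key, so concurrent callers do not log in twice.
    Entry<T> entry = entries.compute(principal, (key, cached) ->
        (cached != null && now < cached.expiresAtMillis)
            ? cached
            : new Entry<>(login.apply(key), now + ttlMillis));
    return entry.ugi;
  }

  public static void main(String[] args) {
    long[] now = {0L};
    int[] logins = {0};
    UgiCache<String> cache = new UgiCache<>(
        principal -> {
          logins[0]++; // a real service would perform the keytab login here
          return "ugi-for-" + principal;
        },
        1000L,
        () -> now[0]);

    cache.get("alice@EXAMPLE.COM"); // first call: performs a login
    cache.get("alice@EXAMPLE.COM"); // second call: served from the cache
    now[0] = 1500L;                 // advance the fake clock past the TTL
    cache.get("alice@EXAMPLE.COM"); // expired: performs a login again
    System.out.println("logins performed: " + logins[0]); // prints "logins performed: 2"
  }
}
```

In a real service the login function would wrap the keytab lookup and the Hadoop call, and the clock would be System::currentTimeMillis. Performing the login inside compute() keeps the sketch short but blocks other callers for the same principal while the login runs, which may or may not be acceptable.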
Problems Encountered
User applications writing to CDAP System tables
...