...

Hadoop's UserGroupInformation class has the following method:

// Log a user in from a keytab file and return the resulting UGI.
public static UserGroupInformation loginUserFromKeytabAndReturnUGI(String user, String path) throws IOException;
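
As an illustration of how a keytab login is typically paired with doAs, here is a minimal sketch; it is not CDAP's actual implementation. To keep it runnable without Hadoop on the classpath, KeytabLogin and LoggedInUser are hypothetical stand-ins for UserGroupInformation.loginUserFromKeytabAndReturnUGI and for the UserGroupInformation object it returns. The class name and the runAs helper are likewise made up for this example.

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

final class ImpersonationSketch {

  // Hypothetical stand-in for the UserGroupInformation returned by the login call;
  // the real class exposes a doAs method of this shape.
  interface LoggedInUser {
    <T> T doAs(PrivilegedExceptionAction<T> action) throws IOException, InterruptedException;
  }

  // Hypothetical stand-in for
  // UserGroupInformation.loginUserFromKeytabAndReturnUGI(String user, String path).
  interface KeytabLogin {
    LoggedInUser login(String principal, String keytabPath) throws IOException;
  }

  // Logs the principal in from its keytab, then runs the action as that user.
  static <T> T runAs(KeytabLogin keytabLogin, String principal, String keytabPath,
                     PrivilegedExceptionAction<T> action)
      throws IOException, InterruptedException {
    LoggedInUser user = keytabLogin.login(principal, keytabPath);
    return user.doAs(action);
  }

  private ImpersonationSketch() { }
}
```

With Hadoop available, the stand-ins would be dropped in favor of calling UserGroupInformation.loginUserFromKeytabAndReturnUGI directly and invoking doAs on the returned object. Keeping the login and the action as separate parameters also leaves room for the principal and keytab to be resolved by a different component.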

...

A similar approach can be used for programs launched by a scheduler. The only difference is that the scheduler would resolve the principal and credentials.


Streams (TBD)

StreamWriters are system code, but they write to user Streams, so these writes should also be impersonated.
It is not yet determined how impersonation will work here, but the above approach cannot be used in this case.
A design and implementation for this will be fleshed out later. A few things to consider when working on that design:

  1. Multiple delegation tokens in a StreamWriter, in order to handle multiple users' streams?
  2. What is the cost of switching users within the StreamWriters (performance impact)?
  3. Running Writers in separate containers, to avoid cost of switching?


Launching of flows (TBD)

When a flow program is launched for the first time, CDAP Master creates an HBase table in the user's namespace to track the state of queue events (which events a particular flowlet has processed and which are still unprocessed). While the flow's flowlets execute, they read and update this table. The HBase table should therefore be created by the user who launches the flow, or at least be readable and writable by that user.
The design for this has not been fleshed out either and will come later.

...