 

...

Hadoop's UserGroupInformation class has the following method:

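// Log a user in from a keytab file and return the resulting UGI.
// This does not affect the currently logged-in user.
public static UserGroupInformation loginUserFromKeytabAndReturnUGI(String user, String path) throws IOException;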

...

A similar approach can be taken for programs launched by a scheduler. The only difference is that the scheduler, rather than the launching user, would resolve the principal and credentials.

 

System-executed operations on user data (dataset admin ops and namespace ops)

When the CDAP system performs dataset operations (creating, deleting, truncating, or upgrading HBase tables, for instance), it is acting on user datasets. Because of this, and because we do not want the CDAP system user to have superuser privileges, we need to impersonate users when executing these dataset admin operations.
To implement this, we'll have a DelegatingDatasetAdmin which will perform all of its operations for a particular UGI.
StorageProviderNamespaceAdmin will also have to perform all of its operations for a particular UGI (i.e. namespace create and namespace delete).
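
The delegation idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CDAP implementation: the `Impersonator` interface, the trimmed-down `DatasetAdmin` interface, `FakeImpersonator`, and `RecordingAdmin` are all hypothetical names introduced here. In a real deployment the `Impersonator` role would presumably be filled by something built on `UserGroupInformation.doAs`; the fake version below only tracks a "current user" string so the sketch runs without Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

public class ImpersonationSketch {

  // Hypothetical abstraction for "run this action as the given principal".
  interface Impersonator {
    <T> T doAs(String principal, Callable<T> action) throws Exception;
  }

  // Hypothetical, trimmed-down admin interface (an assumption for this sketch).
  interface DatasetAdmin {
    void create() throws Exception;
    void truncate() throws Exception;
    void drop() throws Exception;
  }

  // A void operation that may throw, so method references can be passed around.
  interface AdminOp {
    void run() throws Exception;
  }

  // Wraps another DatasetAdmin and performs every operation as one fixed principal.
  static class DelegatingDatasetAdmin implements DatasetAdmin {
    private final DatasetAdmin delegate;
    private final Impersonator impersonator;
    private final String principal;

    DelegatingDatasetAdmin(DatasetAdmin delegate, Impersonator impersonator, String principal) {
      this.delegate = delegate;
      this.impersonator = impersonator;
      this.principal = principal;
    }

    @Override public void create() throws Exception { runAs(delegate::create); }
    @Override public void truncate() throws Exception { runAs(delegate::truncate); }
    @Override public void drop() throws Exception { runAs(delegate::drop); }

    private void runAs(AdminOp op) throws Exception {
      impersonator.doAs(principal, () -> {
        op.run();
        return null;
      });
    }
  }

  // Stand-in impersonator: switches a "current user" field for the duration of the action.
  static class FakeImpersonator implements Impersonator {
    String currentUser = "cdap";

    @Override
    public <T> T doAs(String principal, Callable<T> action) throws Exception {
      String previous = currentUser;
      currentUser = principal;
      try {
        return action.call();
      } finally {
        currentUser = previous; // always restore the system identity
      }
    }
  }

  // Stand-in for a storage-backed admin: records which user each operation ran as.
  static class RecordingAdmin implements DatasetAdmin {
    final List<String> log = new ArrayList<>();
    private final FakeImpersonator identity;

    RecordingAdmin(FakeImpersonator identity) {
      this.identity = identity;
    }

    @Override public void create() { log.add("create as " + identity.currentUser); }
    @Override public void truncate() { log.add("truncate as " + identity.currentUser); }
    @Override public void drop() { log.add("drop as " + identity.currentUser); }
  }

  public static void main(String[] args) throws Exception {
    FakeImpersonator impersonator = new FakeImpersonator();
    RecordingAdmin target = new RecordingAdmin(impersonator);
    DatasetAdmin admin = new DelegatingDatasetAdmin(target, impersonator, "alice");
    admin.create();
    admin.truncate();
    System.out.println(target.log);
  }
}
```

Keeping the impersonation in a wrapper means the underlying admin implementations do not need to know about principals at all. The same wrapping could in principle be applied to StorageProviderNamespaceAdmin for namespace create and delete.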


Upgrade Tool changes (TBD)

The upgrade tool will very likely have to follow the same pattern as the dataset op executor service.
Other miscellaneous tools that interact with user data: the Flowlet pending metrics corrector and the Flowlet queue inspector.


Streams (TBD)

StreamWriters are system code but write to user Streams, so these writes should also be impersonated.
It is not yet determined how impersonation will work here, but the above approach cannot be used in this case.
The design for this will be fleshed out later. A couple of things to consider when working on that design:

...