Overview
This is a runbook for security features in CDAP 3.5. It contains instructions for setting up tests for the various scenarios described in Security-Impersonation-Namespace Mapping Scenarios.
...
- Impersonation allows CDAP to launch CDAP programs and Hive queries as a user configured at the namespace level. The user principal and keytabURI can be provided when creating the namespace.
- The keytab file should be placed in HDFS or on the local filesystem of the master node, and at a minimum the 'cdap' user must be able to read it.
- When impersonation is used, all operations related to that namespace are performed as that user: namespace creation (including custom mapping), program runs, Hive queries, HBase table read/write/create/delete, and HDFS directory create/read/write.
- Since HDFS root directory creation happens by default under the /cdap/namespaces directory, impersonation will not work unless either i) a custom mapping for HDFS is set and that user has RWX permissions on the mapped directory, or ii) the user has RWX permissions on the /cdap/namespaces directory. Option i) is recommended.
- If authorization is also enabled, when programs (such as flows or MapReduce) try to create, read, or write datasets, the authorization permissions for that user are checked to make sure the user has sufficient privileges for those operations. This also applies when datasets in other namespaces are accessed: the impersonated user must have sufficient privileges on the datasets in the other namespace.
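The RWX requirement on the namespace root directory can be checked before creating the namespace. The sketch below is a rough, illustrative helper rather than part of CDAP: it assumes you have captured one line of `hdfs dfs -ls -d <dir>` output, and the function name `has_rwx` and its parameters are hypothetical. It looks only at the basic owner/group/other permission bits and ignores HDFS ACLs and superuser privileges.

```python
def has_rwx(ls_line, user, user_groups=()):
    """Return True if `user` has rwx on the directory described by one
    line of `hdfs dfs -ls -d <dir>` output.

    Assumption: the line has the usual layout
    "<perms> <replication> <owner> <group> <size> <date> <time> <path>".
    ACLs and superuser rights are not considered.
    """
    fields = ls_line.split()
    perms, owner, group = fields[0], fields[2], fields[3]
    if user == owner:
        bits = perms[1:4]          # owner bits
    elif group in user_groups:
        bits = perms[4:7]          # group bits
    else:
        # "t" in the last position means execute plus the sticky bit
        bits = perms[7:10].replace("t", "x")
    return bits == "rwx"
```

For example, passing the `-ls -d` line for the custom-mapped directory together with the short name of the impersonated user shows whether option i) is satisfied for that directory.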
Setting up a namespace, with impersonation, hbase-mapping and file system-mapping
Note: Hive impersonation is not yet complete. All Hive (Explore) operations are still done as the cdap user.
To set up a namespace with impersonation configured, an HBase namespace mapping, and a file system mapping, make an HTTP PUT request to create the namespace, as below.
Note that the configured "root.directory" must be an existing directory (on HDFS, if running distributed CDAP) and "hbase.namespace" must also be a pre-existing HBase namespace.
PUT <HostAndPort>/v3/namespaces/<namespace-id>
with the following body. Make sure to replace the attributes within <>, such as <principal>.
{"name":"<namespace-id>","description":"<namespace description>","config":{"principal":"<principal>","keytabURI":"<path-to-keytab-file>", "root.directory":"<file-system-dir>", "hbase.namespace":"<hbase-namespace>"}}
As an example (using curl):
curl -v -X PUT <HostAndPort>/v3/namespaces/foo -d '{"name":"foo","description":"My foo namespace","config":{"principal":"<PRINCIPAL>","keytabURI":"<KEYTAB_TOKEN>", "root.directory":"/tmp/foo", "hbase.namespace":"foo_ns"}}' -H "Authorization: Bearer <TOKEN>"
Now, operations within that namespace should happen with the configured principal and keytab.
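For scripted test setups, the same request can be assembled in Python. This is a minimal sketch using only the standard library. The function name `build_namespace_request` and its parameters are hypothetical, and `base_url` is assumed to be the scheme plus `<HostAndPort>` of your CDAP instance. The function only builds the request; nothing is sent until you pass it to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_namespace_request(base_url, namespace_id, description, principal,
                            keytab_uri, root_directory, hbase_namespace,
                            token=None):
    """Build (but do not send) the PUT request that creates a namespace
    with impersonation, a file system mapping and an HBase mapping."""
    body = {
        "name": namespace_id,
        "description": description,
        "config": {
            "principal": principal,
            "keytabURI": keytab_uri,
            "root.directory": root_directory,    # must already exist
            "hbase.namespace": hbase_namespace,  # must already exist
        },
    }
    headers = {}
    if token:
        # Same header as in the curl example
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(
        url="%s/v3/namespaces/%s" % (base_url.rstrip("/"), namespace_id),
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="PUT",
    )
```

To issue the call, pass the returned object to `urllib.request.urlopen(...)`. Building the body with `json.dumps` avoids the quoting mistakes that are easy to make when editing the one-line JSON by hand.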
Authorization
- Authorization allows an admin to restrict the operations that a user can perform in CDAP; that is, users must be explicitly granted permissions to perform ADMIN, READ, and WRITE operations on CDAP entities.
- Authorization without impersonation has limited use, since programs, once started, run as the 'cdap' user, and thus any authorization check for dataset access is done for the 'cdap' user even though a different user started the program. In other words, only operations performed outside of CDAP programs are authorization-enforced as the requesting user.
- Apache Sentry can be integrated as the ACL management and enforcer tool for CDAP entities.
- Note: Stream authorization is still WIP and thus authorization won't be enforced for any operations on streams.
...
- Set security.enabled and security.authorization.enabled to true.
- To install Sentry, set up a CDAP cluster using Cloudera Manager, and add the Sentry Service to it.
- Follow the instructions to set up CDAP with Apache Sentry as the authorization backend.
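Before running authorization tests, it can help to confirm that both security flags are actually set. The sketch below assumes the settings are in a Hadoop-style `cdap-site.xml` (`<property>` elements with `<name>` and `<value>` children) and that you have read its contents into a string. The function name `missing_security_settings` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two properties named in the steps above
REQUIRED = ("security.enabled", "security.authorization.enabled")


def missing_security_settings(cdap_site_xml):
    """Return the required properties that are absent or not 'true'."""
    root = ET.fromstring(cdap_site_xml)
    values = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            values[name.strip()] = (prop.findtext("value") or "").strip()
    return [key for key in REQUIRED
            if values.get(key, "").lower() != "true"]
```

An empty list means both flags are enabled. Any returned name identifies a setting to fix before continuing.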
...