...
- Tables in the corresponding HBase namespace to create Table-based datasets
- If you provide a custom HBase namespace when creating the CDAP namespace, it is your responsibility to ensure that every application principal can create tables in that namespace.
- In the HBase shell, either:
grant '<user>', 'AC', '@<namespace>'
- or:
grant '@<group>', 'AC', '@<namespace>'
- If you let CDAP create the namespace, it will use the group name specified in the namespace configuration to issue grant '@<group>', 'AC', '@<namespace>'. In this case, all application owners must be members of that group.
- Tables in the namespace's Hive database, so that Explore can be enabled for datasets. Depending on the Hive authorization settings:
- The application user must be privileged to create tables in the database
- Hive must be configured to grant all privileges to the user that creates a table (depending on Hive configuration, this may not be the case)
- Any sharing between applications that requires additional permissions must be granted manually.
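The HBase grants above can also be issued non-interactively, which is convenient when scripting namespace setup. A minimal sketch, assuming a group named "etl_devs" and an HBase namespace named "cdap_ns1" (both placeholders), and that the hbase CLI is available with admin privileges:

```shell
# Hypothetical example: grant Admin and Create ('AC') on the HBase namespace
# "cdap_ns1" to the group "etl_devs" (both names are placeholders).
# Requires the hbase CLI on the PATH and HBase admin privileges.
echo "grant '@etl_devs', 'AC', '@cdap_ns1'" | hbase shell
```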
...
- FileSetProperties.setUseExisting(true) (or DATA_USE_EXISTING / "data.use.existing") to reuse an existing location and Hive table. The dataset assumes that it does not own the existing data in that location and Hive table; therefore, when you delete or truncate the dataset, the data will not be deleted.
- FileSetProperties.setPossessExisting(true) (or DATA_POSSESS_EXISTING / "data.possess.existing") to take ownership of an existing location and Hive table. The dataset assumes that it owns the existing data in that location and Hive table; therefore, when you delete or truncate the dataset, all data will be deleted, including the previously existing data and Hive partitions.
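These property keys can also be passed when creating a dataset through the REST API. A sketch of such a payload, assuming the dataset-creation endpoint shown later in this document; the "fileSet" type name, host, namespace, and dataset name are placeholders:

```shell
# Sketch: create a fileSet dataset that reuses an existing location and Hive
# table by passing the raw property key behind DATA_USE_EXISTING.
payload='{"typeName": "fileSet", "properties": {"data.use.existing": "true"}}'
echo "$payload"
# With a CDAP router available, the payload would be sent as:
# curl -v -X PUT http://somehost.net:11015/v3/namespaces/{namespace-id}/data/datasets/{dataset-id} \
#      -d "$payload" -H "Authorization: Bearer your_access_token"
```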
...
Add the following to your hbase-site.xml:

```xml
<property>
  <name>hbase.security.exec.permission.checks</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
```
...
- All keytabs must be present on the local filesystem on which CDAP Master is running.
- These keytabs must be present under a path in one of the following formats, and the cdap user must have read access to all of the keytabs:
- /<dir1>/<dir2>/${name}.keytab
- /<dir1>/<dir2>/${name}/${name}.keytab
This path is provided to CDAP as a configuration parameter in cdap-site.xml, for example:

```xml
<property>
  <name>security.keytab.path</name>
  <value>/etc/security/keytabs/${name}.keytab</value>
</property>
```
Here, ${name} will be replaced by CDAP with the short user name of the Kerberos principal that CDAP is impersonating.
Note: You will need to restart CDAP for the configuration changes to take effect.
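The substitution above can be illustrated with a short sketch. This is not CDAP's actual implementation, just a demonstration of how a template like the one configured above resolves for a given principal:

```shell
# Sketch of how the configured template resolves: ${name} is replaced with the
# short user name of the Kerberos principal being impersonated.
principal='someuser/somehost.net@SOMEKDC.NET'
short_name="${principal%%/*}"        # strip the host part -> someuser
short_name="${short_name%%@*}"       # strip the realm if there was no host part
template='/etc/security/keytabs/${name}.keytab'
keytab_path="$(printf '%s\n' "$template" | sed "s|\${name}|$short_name|")"
echo "$keytab_path"                  # -> /etc/security/keytabs/someuser.keytab
```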
...
Add the following to your hive-site.xml and restart Hive:

```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
<property>
  <name>hive.users.in.admin.role</name>
  <value>hive,cdap</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.ProxyUserAuthenticator</value>
</property>
```
...
Note that your hive-site.xml should also be configured to support modifying properties at runtime. Specifically, you will need the following configuration in your hive-site.xml:

```xml
<property>
  <name>hive.security.authorization.sqlstd.confwhitelist.append</name>
  <value>explore.*|mapreduce.job.queuename|mapreduce.job.complete.cancel.delegation.tokens|spark.hadoop.mapreduce.job.complete.cancel.delegation.tokens|mapreduce.job.credentials.binary|hive.exec.submit.local.task.via.child|hive.exec.submitviachild|hive.lock.*</value>
</property>
```
Hive Proxy Users
If you do not use SQL-based authorization, you may want to configure Hive to be able to impersonate other users. Set the following in hive-site.xml:
```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>
```
Make sure that Hive is configured to be able to impersonate the users who can create or access entities in CDAP. This can be done by adding the following properties to your core-site.xml: the first allows the hive user to impersonate users belonging to "group1" and "group2", and the second allows that impersonation on all hosts.
```xml
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>group1,group2</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
```
...
Creating an application from an existing artifact:
```bash
curl -v -X PUT http://hostname.net:11015/v3/namespaces/{namespace-id}/apps/{app-id} \
  -d '{"artifact":{"name":"{artifact-name}","version":"{artifact-version}","scope":"USER"},"principal":"someuser/somehost.net@SOMEKDC.NET"}' \
  -H "Authorization: Bearer your_access_token"
```
...
Creating a stream with an owner:
```bash
curl -v -X PUT http://somehost.net:11015/v3/namespaces/{namespace-id}/streams/{stream-name} \
  -d '{ "ttl": 1, "principal": "someuser/somehost.net@SOMEKDC.NET" }' \
  -H "Authorization: Bearer your_access_token"
```
...
Creating a dataset with an owner:
```bash
curl -v -X PUT http://somehost.net:11015/v3/namespaces/{namespace-id}/data/datasets/{dataset-id} \
  -d '{ "typeName": "table", "properties": {}, "principal": "someuser/somehost.net@SOMEKDC.NET" }' \
  -H "Authorization: Bearer your_access_token"
```
Querying dataset properties for owner information:
```bash
curl -v http://hostname.net:11015/v3/namespaces/{namespace-id}/data/datasets/{dataset-name} \
  -H "Authorization: Bearer your_access_token"
```
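The owner, if one was set, is reported as a principal in the response body. A sketch of pulling it out of a hypothetical response (the field names shown here are illustrative, not a guaranteed response schema):

```shell
# Hypothetical response body from the dataset query above; the owner is
# reported in the "principal" field:
response='{"name": "mydataset", "principal": "someuser/somehost.net@SOMEKDC.NET"}'
echo "$response" | python3 -c 'import json, sys; print(json.load(sys.stdin)["principal"])'
```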
...