...

  • FileSetProperties.setUseExisting(true) (or DATA_USE_EXISTING / "data.use.existing") to reuse an existing location and Hive table. The dataset will assume that it does not own the existing data in that location and Hive table, and therefore, when you delete or truncate the dataset, the data will not be deleted. 
  • FileSetProperties.setPossessExisting(true) (or DATA_POSSESS_EXISTING / "data.possess.existing") to assume ownership of an existing location and Hive table. The dataset will assume that it owns the existing data in that location and Hive table, and therefore, when you delete or truncate the dataset, all data will be deleted, including the previously existing data and Hive partitions.

Note that in both cases, the existing partitions in the Hive table are not known to CDAP and are therefore accessible only via Hive, not through the PartitionedFileSet APIs.
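The difference between the two options can be illustrated with a small sketch. It uses plain maps and the two property keys quoted above so that it runs without CDAP on the classpath; the class and method names (ExistingLocationProps, useExisting, possessExisting, deleteRemovesExistingData) are hypothetical and only model the ownership rule described in the list. In a real application you would set these properties through the FileSetProperties setters named above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of CDAP: a toy model of the ownership rule.
class ExistingLocationProps {
    // Property keys as quoted in the text above.
    static final String DATA_USE_EXISTING = "data.use.existing";
    static final String DATA_POSSESS_EXISTING = "data.possess.existing";

    /** Raw properties for reusing an existing location without owning its data. */
    static Map<String, String> useExisting() {
        Map<String, String> props = new HashMap<>();
        props.put(DATA_USE_EXISTING, "true");
        return props;
    }

    /** Raw properties for taking ownership of an existing location and its data. */
    static Map<String, String> possessExisting() {
        Map<String, String> props = new HashMap<>();
        props.put(DATA_POSSESS_EXISTING, "true");
        return props;
    }

    /**
     * Models the rule from the text: previously existing data is removed on
     * delete or truncate only when the dataset possesses the location.
     */
    static boolean deleteRemovesExistingData(Map<String, String> props) {
        // parseBoolean(null) is false, so a missing key means "not possessed".
        return Boolean.parseBoolean(props.get(DATA_POSSESS_EXISTING));
    }

    public static void main(String[] args) {
        System.out.println("use existing:     " + deleteRemovesExistingData(useExisting()));
        System.out.println("possess existing: " + deleteRemovesExistingData(possessExisting()));
    }
}
```

The sketch deliberately says nothing about what happens when both properties are set, because the text above does not specify that case.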

 

Cluster Configuration and Setup

To use application-level impersonation in CDAP, you will need to adjust some of your cluster's configuration. Below is a list of changes you might have to make to ensure your cluster can support app-level impersonation in CDAP. Note that some of these settings might already exist in your environment, in which case you can skip them.

 

Enable HBase Authorization (if needed)

Add the following to your hbase-site.xml:

<property>
  <name>hbase.security.exec.permission.checks</name>
  <value>true</value>
</property>
<property>
  <name>hbase.coprocessor.master.classes</name>
  <value>org.apache.hadoop.hbase.security.access.AccessController</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController</value>
</property>

You will need to restart HBase after the above configuration changes. 
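To confirm that the three properties are actually present in the file, a small check along the following lines may help. This is only a sketch: the class HBaseAuthCheck and its methods are hypothetical, it uses the JDK's built-in XML parser rather than any Hadoop or HBase API, and it assumes the properties sit inside a single root element (hbase-site.xml normally uses <configuration>).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of HBase or CDAP.
class HBaseAuthCheck {
    static final String ACCESS_CONTROLLER =
        "org.apache.hadoop.hbase.security.access.AccessController";
    static final String TOKEN_PROVIDER =
        "org.apache.hadoop.hbase.security.token.TokenProvider";

    /** Parses <property><name/><value/></property> entries into a map. */
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = text(property, "name");
                String value = text(property, "value");
                if (name != null && value != null) {
                    props.put(name, value);
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse configuration XML", e);
        }
    }

    /** Returns the trimmed text of the first child element with the given tag, or null. */
    static String text(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent().trim();
    }

    /** True if the comma-separated class list contains the given class name. */
    static boolean hasClass(String list, String className) {
        if (list == null) {
            return false;
        }
        for (String entry : list.split(",")) {
            if (entry.trim().equals(className)) {
                return true;
            }
        }
        return false;
    }

    /** Checks the three settings listed in the hbase-site.xml snippet above. */
    static boolean authorizationEnabled(Map<String, String> props) {
        String region = props.get("hbase.coprocessor.region.classes");
        return "true".equals(props.get("hbase.security.exec.permission.checks"))
            && hasClass(props.get("hbase.coprocessor.master.classes"), ACCESS_CONTROLLER)
            && hasClass(region, TOKEN_PROVIDER)
            && hasClass(region, ACCESS_CONTROLLER);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java HBaseAuthCheck <path to hbase-site.xml>");
            return;
        }
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        System.out.println("authorization settings present: " + authorizationEnabled(parse(xml)));
    }
}
```

The check only inspects the file on disk. It does not tell you whether HBase has been restarted and picked up the change.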

Configure CDAP for App-Level Impersonation

App-level impersonation lets applications, datasets, and streams each have their own owner, and operations performed in CDAP impersonate the respective owner. To support this, CDAP must have access to the owner principals and their associated keytabs. The owner principal of an entity is provided when the entity is created (see the REST API documentation in the next section).