

 

 

Goals

  1. Allow CDAP users to securely store sensitive data.
  2. Allow authorized CDAP users to access stored data at runtime.
  3. Allow authorized CDAP users to manage the stored data.

Checklist

  • User stories documented (Nishith)
  • User stories reviewed (Nitin)
  • Design documented (Nishith)
  • Design reviewed (Andreas/Terence)
  • Feature merged (Nishith)
  • Blog post 

User Stories

  1. As a CDAP/Hydrator security admin, I want all sensitive information, such as passwords, not to be stored in plaintext.

 

Brief introduction to Hadoop KMS

Hadoop KMS is a cryptographic key management server based on Hadoop’s KeyProvider API.

It provides client and server components that communicate over HTTP using a REST API.

The client is a KeyProvider implementation which interacts with the KMS using the KMS HTTP REST API.

The KMS is a proxy that interfaces with a backing key store on behalf of HDFS daemons and clients. Both the backing key store and the KMS implement the Hadoop KeyProvider API. A default Java KeyStore implementation is provided for testing but is not recommended for production use. Cloudera provides Navigator Key Trustee for production clusters. Hortonworks recommends using Ranger KMS.



*Image taken from Cloudera engineering Blog


Design

 

The stored entity will be composed of three parts:

  1. Alias: This will be the identifier, provided by the user, that will be used to retrieve the object.
  2. Properties: A key value map containing the properties of the object being stored.
  3. Data: The data being stored. Passed in as a byte array.
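
The three parts can be pictured as a small immutable value class. This is only a sketch: the class name SecureEntity and its accessors are hypothetical and are not part of the design. It assumes that the alias must be non-empty and that the byte array is copied on the way in and out, so callers cannot alter the stored data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical value class; the name SecureEntity is not part of the design.
final class SecureEntity {
  private final String alias;
  private final Map<String, String> properties;
  private final byte[] data;

  SecureEntity(String alias, Map<String, String> properties, byte[] data) {
    // Assumption: an empty alias is rejected because it could not be used for retrieval.
    if (alias == null || alias.isEmpty()) {
      throw new IllegalArgumentException("alias must be non-empty");
    }
    this.alias = alias;
    // Copy both containers so later changes by the caller do not leak in.
    this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
    this.data = Arrays.copyOf(data, data.length);
  }

  String getAlias() {
    return alias;
  }

  Map<String, String> getProperties() {
    return properties;
  }

  // Return a copy so callers cannot modify the stored bytes.
  byte[] getData() {
    return Arrays.copyOf(data, data.length);
  }
}
```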

 

Design decisions

  1. Hadoop KMS supports versioning for the keys it stores. This is used mainly for key rollovers. In this release, we won't support versioning.

 

The following operations will be supported by the store:

  • Store
  • Get data
  • Get metadata
  • Get metadata list
  • List
  • Delete

 

The system will expose the following APIs to clients:

 

Secure Store Programmatic API
import java.util.List;
import java.util.Map;

// Represents the metadata about the data
interface SecureStoreMetaData {
  String getName();
  String getDescription();
  long getLastModifiedTime();
  Map<String, String> getProperties();
}
 
// Represents the secure data
interface SecureStoreData {
  // Returns the meta data about the secure data
  SecureStoreMetaData getMetaData();
 
  // Returns the secure data
  byte[] get();
}
 
// Provides read-only access to secure store
interface SecureStore {
  // Returns a list of available secure data in the secure store.
  List<String> list();
  // Returns a list of metadata objects for the list of data items
  List<SecureStoreMetaData> getMetadata(List<String> data);
 
  // Gets the secure data
  SecureStoreData get(String name);
}
 
// Manager interface for managing secure data
interface SecureStoreManager {
  // Stores the secure data
  void put(String name, byte[] data, Map<String, String> properties);
 
  // Remove the secure data
  void delete(String name);
}
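
To show how the read and manage interfaces fit together, here is a minimal in-memory sketch. The class InMemorySecureStore is hypothetical and encrypts nothing. The four interfaces are restated so the example compiles on its own. Two assumptions are made that the design does not state: the description is read from a "description" entry in the properties map (because put() takes no description argument), and get() returns null for an unknown name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Interfaces restated from the design so that this sketch is self-contained.
interface SecureStoreMetaData {
  String getName();
  String getDescription();
  long getLastModifiedTime();
  Map<String, String> getProperties();
}

interface SecureStoreData {
  SecureStoreMetaData getMetaData();
  byte[] get();
}

interface SecureStore {
  List<String> list();
  List<SecureStoreMetaData> getMetadata(List<String> data);
  SecureStoreData get(String name);
}

interface SecureStoreManager {
  void put(String name, byte[] data, Map<String, String> properties);
  void delete(String name);
}

// Hypothetical in-memory implementation, for illustration only. Nothing is encrypted.
final class InMemorySecureStore implements SecureStore, SecureStoreManager {
  private final Map<String, SecureStoreData> entries = new TreeMap<>();

  @Override
  public synchronized void put(String name, byte[] data, Map<String, String> properties) {
    final byte[] copy = Arrays.copyOf(data, data.length);
    final Map<String, String> props = Collections.unmodifiableMap(new HashMap<>(properties));
    final long now = System.currentTimeMillis();
    final SecureStoreMetaData meta = new SecureStoreMetaData() {
      @Override public String getName() { return name; }
      // Assumption: the description is carried in the properties map.
      @Override public String getDescription() { return props.getOrDefault("description", ""); }
      @Override public long getLastModifiedTime() { return now; }
      @Override public Map<String, String> getProperties() { return props; }
    };
    entries.put(name, new SecureStoreData() {
      @Override public SecureStoreMetaData getMetaData() { return meta; }
      @Override public byte[] get() { return Arrays.copyOf(copy, copy.length); }
    });
  }

  @Override
  public synchronized void delete(String name) {
    entries.remove(name);
  }

  @Override
  public synchronized List<String> list() {
    return new ArrayList<>(entries.keySet());
  }

  // Unknown names are skipped rather than reported as errors (an assumption of this sketch).
  @Override
  public synchronized List<SecureStoreMetaData> getMetadata(List<String> names) {
    List<SecureStoreMetaData> result = new ArrayList<>();
    for (String n : names) {
      SecureStoreData d = entries.get(n);
      if (d != null) {
        result.add(d.getMetaData());
      }
    }
    return result;
  }

  // Returns null when the name is unknown (an assumption of this sketch).
  @Override
  public synchronized SecureStoreData get(String name) {
    return entries.get(name);
  }
}
```

A program would normally see only the SecureStore view of such an object, while an administrative client would hold the SecureStoreManager view.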

 

REST API

Operation: Put
REST API: POST /security/store/v1/key
Body:

Content-Type: application/json

Put Data
{
  "name"        :  "<name>",
  "description" :  "<description>",
  "data"        :  "<data>",  //base64
  "properties"  :  {
    "key"  :  "value",
    ...
  }
}

Response:

201 Created

Operation: Delete
REST API: DELETE /security/store/v1/key/<key-name>
Body: N/A
Response:

200 OK

404 Not Found

Operation: Get
REST API: GET /security/store/v1/key/<key-name>
Body: N/A
Response:

200 OK

Content-Type: application/json

{
  "name"  :  "<name>",
  "data"  :  "<data>"  //base64
}

404 Not Found

Operation: Get Metadata
REST API: GET /security/store/v1/key/<key-name>/metadata
Body: N/A
Response:

200 OK

Content-Type: application/json

{
  "name"        :  "<name>",
  "description" :  "<description>",
  "created"     :  <millis-epoch>, //long
  "properties"  :  {
    "key"  :  "value",
    ...
  }
}

404 Not Found

Operation: List
REST API: GET /security/store/v1/keys/names
Body: N/A
Response:

200 OK

Content-Type: application/json

[
  "<key-name>",
  "<key-name>",
  "<key-name>",
  ...
]

Operation: Get multiple Metadata
REST API: GET /security/store/v1/keys/metadata?key=<key-name>&key=<key-name>,...
Body: N/A
Response:

200 OK

Content-Type: application/json

[
  {
    "name"        :  "<name>",
    "description" :  "<description>",
    "created"     :  <millis-epoch>, //long
    "properties"  :  {
      "key"  :  "value",
      ...
    }
  },
  {
    "name"        :  "<name>",
    "description" :  "<description>",
    "created"     :  <millis-epoch>, //long
    "properties"  :  {
      "key"  :  "value",
      ...
    }
  }
]
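
As an illustration of the Put body, the sketch below assembles the JSON document and Base64-encodes the data field using only the JDK. The helper class PutRequestBody and its methods are hypothetical. Its escaping covers only backslashes and double quotes, so a real client would more likely use a JSON library.

```java
import java.util.Base64;
import java.util.Map;
import java.util.StringJoiner;
import java.util.TreeMap;

// Hypothetical helper; the class and method names are not part of the design.
final class PutRequestBody {
  // Escapes backslashes and double quotes only; control characters are not handled in this sketch.
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  // Builds the JSON body for POST /security/store/v1/key.
  static String build(String name, String description, byte[] data, Map<String, String> properties) {
    StringJoiner props = new StringJoiner(",", "{", "}");
    // Sort the keys so that the output is deterministic.
    for (Map.Entry<String, String> e : new TreeMap<>(properties).entrySet()) {
      props.add(quote(e.getKey()) + ":" + quote(e.getValue()));
    }
    return "{"
        + "\"name\":" + quote(name) + ","
        + "\"description\":" + quote(description) + ","
        + "\"data\":" + quote(Base64.getEncoder().encodeToString(data)) + ","
        + "\"properties\":" + props
        + "}";
  }
}
```

For example, building a body for the name "db-password" with the UTF-8 bytes of "secret" as data produces a "data" field of "c2VjcmV0". The resulting string would be sent with Content-Type: application/json.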

 

 

Access Control

The secure store can be protected with a key in the CDAP master keystore, which CDAP already requires the user to provide in order to enable SSL. Because the program runs in the same JVM as the SDK process, it can access the sensitive data directly through a Guice binding that binds the SecureStore interface to the actual implementation.

KMS uses Hadoop Authentication for HTTP authentication. Hadoop Authentication issues a signed HTTP Cookie once the client has authenticated successfully.

Caching

Hadoop KMS caches keys for a short period of time to avoid excessive hits to the underlying key provider. Of the operations we are interested in, only two use the cache: get data and get metadata.
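
The general idea of short-lived caching can be sketched as follows. This is not the Hadoop KMS cache: ExpiringCache, its time-to-live parameter, and the injected clock and loader are all hypothetical names chosen for illustration. An entry is served from memory until its time-to-live has elapsed, after which the next read goes back to the underlying provider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical time-based cache; it does not reproduce the Hadoop KMS cache implementation.
final class ExpiringCache<V> {
  private static final class Entry<T> {
    final T value;
    final long expiresAtMillis;

    Entry(T value, long expiresAtMillis) {
      this.value = value;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
  private final long ttlMillis;
  private final LongSupplier clock;          // injected so that tests can control time
  private final Function<String, V> loader;  // stands in for the underlying key provider

  ExpiringCache(long ttlMillis, LongSupplier clock, Function<String, V> loader) {
    this.ttlMillis = ttlMillis;
    this.clock = clock;
    this.loader = loader;
  }

  // Returns the cached value, reloading it when it is missing or expired.
  // Two threads may load the same name concurrently; the later write wins.
  V get(String name) {
    long now = clock.getAsLong();
    Entry<V> e = entries.get(name);
    if (e == null || now >= e.expiresAtMillis) {
      e = new Entry<>(loader.apply(name), now + ttlMillis);
      entries.put(name, e);
    }
    return e.value;
  }

  // Drops an entry, for example after the key has been deleted.
  void invalidate(String name) {
    entries.remove(name);
  }
}
```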

Audit logs

All access to the secure store will be logged. 

Audit logs are aggregated by KMS for API accesses to the GET_KEY_VERSION, GET_CURRENT_KEY, DECRYPT_EEK, and GENERATE_EEK operations.

Entries are grouped by the combined (user, key, operation) key for a configurable aggregation interval, after which the number of accesses to the specified endpoint by the user for a given key is flushed to the audit log.
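
The aggregation described above can be sketched as a counter map keyed by the (user, key, operation) triple. The class AuditAggregator and the log line format are hypothetical and do not reproduce the KMS audit code. In practice a timer would call flush() once per aggregation interval.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregator; it illustrates the grouping idea and is not the KMS audit code.
final class AuditAggregator {
  // The combined key is the list [user, key, operation]; lists compare by content.
  private final Map<List<String>, Integer> counts = new LinkedHashMap<>();

  // Records one access under the combined (user, key, operation) key.
  synchronized void record(String user, String key, String operation) {
    counts.merge(Arrays.asList(user, key, operation), 1, Integer::sum);
  }

  // Returns one line per combined key with its access count, then resets the counters.
  // The line format is an assumption made for this sketch.
  synchronized List<String> flush() {
    List<String> lines = new ArrayList<>();
    for (Map.Entry<List<String>, Integer> e : counts.entrySet()) {
      List<String> k = e.getKey();
      lines.add("user=" + k.get(0) + " key=" + k.get(1) + " op=" + k.get(2)
          + " accesses=" + e.getValue());
    }
    counts.clear();
    return lines;
  }
}
```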

 

Implementation

The following two implementations will be provided:

Standalone mode

An implementation using standard Java tools (JKS or JCEKS) will be provided. The secure store will be kept in an encrypted file on the local filesystem.
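
As a rough illustration of the standalone approach, the sketch below keeps arbitrary bytes in a JCEKS keystore using only JDK classes. The helper class JceksSketch is hypothetical. It assumes that the data is wrapped in a SecretKeySpec with a placeholder algorithm label and that each entry is protected with the same password as the keystore itself. The keystore is serialized to a byte array here; a file stream would be used in the same way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper showing how arbitrary bytes can be kept in a JCEKS keystore.
final class JceksSketch {
  // Assumption: the bytes are wrapped as a secret key; the algorithm label is a placeholder.
  private static final String ALGORITHM = "none";

  // Creates an empty in-memory keystore.
  static KeyStore newStore(char[] password) throws GeneralSecurityException, IOException {
    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(null, password); // a null stream initializes an empty keystore
    return ks;
  }

  // Stores non-empty data under the alias, protected by the password.
  static void put(KeyStore ks, String alias, byte[] data, char[] password)
      throws GeneralSecurityException {
    ks.setEntry(alias,
        new KeyStore.SecretKeyEntry(new SecretKeySpec(data, ALGORITHM)),
        new KeyStore.PasswordProtection(password));
  }

  // Returns the stored bytes, or null if the alias does not exist.
  static byte[] get(KeyStore ks, String alias, char[] password) throws GeneralSecurityException {
    Key key = ks.getKey(alias, password);
    return key == null ? null : key.getEncoded();
  }

  // Serializes the keystore; the password protects its integrity.
  static byte[] save(KeyStore ks, char[] password) throws GeneralSecurityException, IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    ks.store(out, password);
    return out.toByteArray();
  }

  // Loads a previously saved keystore.
  static KeyStore load(byte[] bytes, char[] password) throws GeneralSecurityException, IOException {
    KeyStore ks = KeyStore.getInstance("JCEKS");
    ks.load(new ByteArrayInputStream(bytes), password);
    return ks;
  }
}
```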

Distributed mode

The cluster has KMS running

If the cluster has KMS running, we will use it to securely store sensitive information. To do that, we will implement the Hadoop KeyProvider API and forward user calls to KMS. The API methods that need to be implemented are listed below.

The cluster does not have KMS running

This mode will not be supported in this release.

 

 

JavaSecureStoreProvider
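import java.io.IOException;
import java.net.URI;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.crypto.key.KeyProvider;
import org.apache.hadoop.crypto.key.KeyProviderFactory;

// Implementation needs to be thread safe.
// Method bodies are placeholders until the implementation is written.
public class JavaSecureStoreProvider extends KeyProvider {
  private JavaSecureStoreProvider(URI uri, Configuration conf) throws IOException {
    super(conf);
    // Get the file path for local storage
    // Get the password for the secure store
    // Load or create the store
  }

  // Since we are not supporting versioning, the KeyVersion will always be current
  @Override
  public KeyVersion getKeyVersion(String versionName) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  // Lists all the keys that are accessible to this user.
  @Override
  public List<String> getKeys() throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  // Since we are not supporting versioning, this will only have one item
  @Override
  public List<KeyVersion> getKeyVersions(String name) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  @Override
  public Metadata getMetadata(String name) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  @Override
  public KeyVersion createKey(String name, byte[] material, Options options) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  @Override
  public void deleteKey(String name) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  // Will be a no-op for this version, since versioning is not supported
  @Override
  public KeyVersion rollNewVersion(String name, byte[] material) throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  @Override
  public void flush() throws IOException {
    throw new UnsupportedOperationException("Not yet implemented");
  }

  public static class Factory extends KeyProviderFactory {
    @Override
    public KeyProvider createProvider(URI providerName,
                                      Configuration conf) throws IOException {
      throw new UnsupportedOperationException("Not yet implemented");
    }
  }
}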

 

 

Out-of-scope User Stories (4.0 and beyond)

  1. Support for secure store in distributed mode when KMS is not present.

References

Secure Store

https://hadoop.apache.org/docs/stable/hadoop-kms/index.html

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html

https://hadoop.apache.org/docs/r2.7.2/api/org/apache/hadoop/crypto/key/KeyProvider.html

 
