Goals
- Authorize a subset of operations on CDAP entities using Apache Sentry.
- Make the authorization system pluggable. Support the following two systems to begin with:
  - Sentry based
  - CDAP Dataset based
Checklist
- User stories documented (Rohit/Bhooshan)
- User stories reviewed (Nitin)
- Design documented (Rohit/Bhooshan)
- Design reviewed (Andreas)
- Feature merged (Rohit/Bhooshan)
- Examples and guides (Rohit)
- Integration tests (Bhooshan)
- Documentation for feature (Rohit/Bhooshan)
- Blog post
...
User Stories
- As a CDAP system, I should be able to integrate with Apache Sentry for fine-grained, role-based access control of select CDAP operations.
- As a CDAP admin, I should be able to easily configure Sentry to work with CDAP on different types of clusters (e.g., CDH or CM clusters).
- As a CDAP admin, I should be able to create/update/delete roles in Apache Sentry.
- As a CDAP admin, I should be able to add users/groups to roles in Apache Sentry.
- As a CDAP admin, I should be able to turn authorization on/off easily for the entire CDAP instance.
- As a CDAP system, I should be able to authorize the following requests:
  - Namespace create/update/delete
  - Application deployment
  - Program start/stop
  - Stream read/write (Not Implemented in 3.4)
These operations are a subset that represents the various 'kinds' of operations allowed in CDAP.
Scenarios
Scenario #1
- D-Rock is an IT admin extraordinaire who has just been tasked with adding authorization for access to entities in CDAP on the cluster he manages.
- D-Rock is already familiar with Apache Sentry, since he has used it for authorization in other projects like Apache HDFS, Apache Hive, Apache Sqoop, etc.
- He would rather not learn a new authorization system. He would instead prefer that Apache Sentry be used to provide Role Based Access Control to CDAP entities as well.
- As part of this, he would also like a streamlined installation and configuration experience with Apache Sentry and CDAP, including detailed instructions.
Scenario #2
- D-Rock manages a variety of CDAP clusters in dev/smoke/qa/staging environments along with the prod environment.
- For these environments, he would like to be able to turn authorization on/off easily with a switch for the CDAP instance, depending on the need at a given time.
Scenario #3
- Ideally, D-Rock would like to be able to authorize all operations on all entities in CDAP.
- However, this can be rolled out in phases. In the initial phase, he would like to control who can:
  - Create/update/delete a namespace
    - Only users with WRITE permission on the CDAP instance should be able to perform this operation.
    - A property in sentry-site.xml will define the set of users who have admin permission on the CDAP instance. These admins can later grant permissions to other users.
  - Deploy an application in a namespace
    - Only users with WRITE permission on the namespace should be able to perform this operation.
    - Once the application is deployed, the user who deployed it becomes the ADMIN of the application.
  - Start/stop a program
    - Only users with READ permission on the namespace and application, and EXECUTE permission on the program, should be able to perform this operation.
    - Only users with ADMIN permission on the program can set preferences for the program.
    - Only users with WRITE permission can provide runtime args.
  - Read/write to a stream
    - Only users with READ privilege on the namespace and READ permission on the stream should be able to read from the stream.
    - Only users with READ privilege on the namespace and WRITE permission on the stream should be able to write to the stream.
    - Note: We have decided not to handle views separately. A user has the same permissions on all views of a stream as on the stream itself.
Entities, Operations and Privileges
Entity | Operation | Required Privileges | Resultant Privileges |
---|---|---|---|
Namespace | create | ADMIN (Instance) | ADMIN (Namespace) |
Namespace | update | ADMIN (Namespace) | |
Namespace | list | READ (Instance) | |
Namespace | get | READ (Namespace) | |
Namespace | delete | ADMIN (Namespace) | |
Namespace | set preference | WRITE (Namespace) | |
Namespace | get preference | READ (Namespace) | |
Namespace | search | READ (Namespace) | |
Artifact | add | WRITE (Namespace) | ADMIN (Artifact) |
Artifact | delete | ADMIN (Artifact) | |
Artifact | get | READ (Artifact) | |
Artifact | list | READ (Namespace) | |
Artifact | write property | ADMIN (Artifact) | |
Artifact | delete property | ADMIN (Artifact) | |
Artifact | get property | READ (Artifact) | |
Artifact | refresh | WRITE (Instance) | |
Artifact | write metadata | ADMIN (Artifact) | |
Artifact | read metadata | READ (Artifact) | |
Application | deploy | WRITE (Namespace) | ADMIN (Application) |
Application | get | READ (Application) | |
Application | list | READ (Namespace) | |
Application | update | ADMIN (Application) | |
Application | delete | ADMIN (Application) | |
Application | set preference | WRITE (Application) | |
Application | get preference | READ (Application) | |
Application | add metadata | ADMIN (Application) | |
Application | get metadata | READ (Application) | |
Programs | start/stop/debug | EXECUTE (Program) | |
Programs | set instances | ADMIN (Program) | |
Programs | list | READ (Namespace) | |
Programs | set runtime args | EXECUTE (Program) | |
Programs | get runtime args | READ (Program) | |
Programs | get instances | READ (Program) | |
Programs | set preference | ADMIN (Program) | |
Programs | get preference | READ (Program) | |
Programs | get status | READ (Program) | |
Programs | get history | READ (Program) | |
Programs | add metadata | ADMIN (Program) | |
Programs | get metadata | READ (Program) | |
Programs | emit logs | WRITE (Program) | |
Programs | view logs | READ (Program) | |
Programs | emit metrics | WRITE (Program) | |
Programs | view metrics | READ (Program) | |
Streams | create | WRITE (Namespace) | ADMIN (Stream) |
Streams | update properties | ADMIN (Stream) | |
Streams | delete | ADMIN (Stream) | |
Streams | truncate | ADMIN (Stream) | |
Streams | enqueue / asyncEnqueue / batch | WRITE (Stream) | |
Streams | get | READ (Stream) | |
Streams | list | READ (Namespace) | |
Streams | read events | READ (Stream) | |
Streams | set preferences | ADMIN (Stream) | |
Streams | get preferences | READ (Stream) | |
Streams | add metadata | ADMIN (Stream) | |
Streams | get metadata | READ (Stream) | |
Streams | view lineage | READ (Stream) | |
Streams | emit metrics | WRITE (Stream) | |
Streams | view metrics | READ (Stream) | |
Datasets | list | READ (Namespace) | |
Datasets | get | READ (Dataset) | |
Datasets | create | WRITE (Namespace) | ADMIN (Dataset) |
Datasets | update | ADMIN (Dataset) | |
Datasets | drop | ADMIN (Dataset) | |
Datasets | executeAdmin (exists/truncate/upgrade) | ADMIN (Dataset) | |
Datasets | add metadata | ADMIN (Dataset) | |
Datasets | get metadata | READ (Dataset) | |
Datasets | view lineage | READ (Dataset) | |
Datasets | emit metrics | WRITE (Dataset) | |
Datasets | view metrics | READ (Dataset) | |
NOTE: Cells marked green are in scope for 3.4
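The "Required Privileges" and "Resultant Privileges" columns can be read as a check followed by a grant. The following is a minimal sketch of that reading for application deployment, not CDAP code: `PrivilegeSketch`, its nested `Action` enum, and the string-based entity keys are all hypothetical names chosen for illustration.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of one table row; none of these names come from CDAP.
class PrivilegeSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }

  // Key: "user@entity"; value: the actions granted to that user on that entity.
  private final Map<String, Set<Action>> grants = new HashMap<>();

  void grant(String user, String entity, Action action) {
    grants.computeIfAbsent(user + "@" + entity, k -> EnumSet.noneOf(Action.class)).add(action);
  }

  boolean has(String user, String entity, Action action) {
    return grants.getOrDefault(user + "@" + entity, EnumSet.noneOf(Action.class)).contains(action);
  }

  // Application deploy: requires WRITE (Namespace), results in ADMIN (Application).
  boolean deployApp(String user, String namespace, String app) {
    if (!has(user, namespace, Action.WRITE)) {
      return false; // a real enforcement hook would throw instead of returning false
    }
    grant(user, namespace + "/" + app, Action.ADMIN);
    return true;
  }
}
```

The same two-step shape (check the required privilege on the parent, then grant ADMIN on the new entity) applies to every row in the table that has a resultant privilege.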
Design
This feature can be broken down into the following main parts, in no specific order:
Authorization in CDAP
The authorization system in CDAP will be pluggable, and the backend can be provided by external systems like Apache Sentry/Ranger. It provides:
- Authorization enforcement hooks during various operations within CDAP, which throw AuthorizationException if the operation is not authorized.
- ACL management
This system exposes a set of interfaces defined below.
AuthEnforcer
The AuthEnforcer interface provides a way to check whether an operation is authorized. At various points in the CDAP code (NamespaceHttpHandler, AppLifecycleHttpHandler, ProgramLifecycleHttpHandler, and StreamHandler in 3.4), this interface will be used to check whether an operation is authorized.
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
interface AuthEnforcer {
/**
* Enforces authorization for the specified {@link Principal} for the specified {@link Action} on the specified {@link EntityId}.
*
* @param principal the principal that performs the actions. This could be a user, group or a role
* @param entity the entity on which an action is being performed
* @param action the action being performed
* @throws AuthorizationException if the principal is not authorized to perform action on the entity
*/
void enforce(Principal principal, EntityId entity, Action action) throws AuthorizationException;
} |
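To make the enforcement hook concrete, here is a minimal sketch of how a handler might call `enforce` before changing any state. It is an illustration under stated assumptions: `EnforceSketch`, `NamespaceHandler`, and `ADMIN_ONLY` are hypothetical, and plain strings stand in for `Principal` and `EntityId` so the example stays self-contained.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-ins for illustration; not the CDAP classes themselves.
class EnforceSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }

  static class AuthorizationException extends Exception {
    AuthorizationException(String message) {
      super(message);
    }
  }

  // Simplified form of the AuthEnforcer interface: strings replace Principal and EntityId.
  interface AuthEnforcer {
    void enforce(String principal, String entity, Action action) throws AuthorizationException;
  }

  // Hypothetical handler that runs the authorization check before touching any state.
  static class NamespaceHandler {
    private final AuthEnforcer enforcer;
    private final Set<String> namespaces = new HashSet<>();

    NamespaceHandler(AuthEnforcer enforcer) {
      this.enforcer = enforcer;
    }

    void create(String principal, String namespace) throws AuthorizationException {
      // Namespace create requires ADMIN on the instance (see the privileges table).
      enforcer.enforce(principal, "instance", Action.ADMIN);
      namespaces.add(namespace);
    }

    boolean exists(String namespace) {
      return namespaces.contains(namespace);
    }
  }

  // Toy enforcer: only the principal named "admin" is allowed to do anything.
  static final AuthEnforcer ADMIN_ONLY = (principal, entity, action) -> {
    if (!"admin".equals(principal)) {
      throw new AuthorizationException(principal + " may not " + action + " " + entity);
    }
  };
}
```

Because `enforce` throws rather than returning a boolean, the handler cannot accidentally ignore a denial: the namespace is only added when the check passes.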
Authorizer
This interface allows CDAP admins to grant/revoke permissions for specific operations on specific CDAP entities to specified Principals. It will be used by the ACL Management module, which may or may not reside in CDAP for the purposes of integration with Apache Sentry (TBD).
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
public interface Authorizer {
  /**
   * Initialize the {@link Authorizer}. Authorization extensions can use this method to access an
   * {@link AuthorizationContext} that allows them to interact with CDAP for operations such as creating and accessing
   * datasets, executing dataset operations in transactions, etc.
   *
   * @param context the {@link AuthorizationContext} that can be used to interact with CDAP
   */
  void initialize(AuthorizationContext context) throws Exception;

  /**
   * Enforces authorization for the specified {@link Principal} for the specified {@link Action} on the specified
   * {@link EntityId}.
   *
   * @param entity the {@link EntityId} on which authorization is to be enforced
   * @param principal the {@link Principal} that performs the actions
   * @param action the {@link Action} being performed
   * @throws UnauthorizedException if the principal is not authorized to perform action on the entity
   * @throws Exception if any other errors occurred while performing the authorization enforcement check
   */
  void enforce(EntityId entity, Principal principal, Action action) throws Exception;

  /**
   * Grants a {@link Principal} authorization to perform a set of {@link Action actions} on an {@link EntityId}.
   *
   * @param entity the {@link EntityId} to whom {@link Action actions} are to be granted
   * @param principal the {@link Principal} that performs the actions. This could be a user, or role
   * @param actions the set of {@link Action actions} to grant.
   */
  void grant(EntityId entity, Principal principal, Set<Action> actions) throws Exception;

  /**
   * Revokes a {@link Principal principal's} authorization to perform a set of {@link Action actions} on
   * an {@link EntityId}.
   *
   * @param entity the {@link EntityId} whose {@link Action actions} are to be revoked
   * @param principal the {@link Principal} that performs the actions. This could be a user, group or a role
   * @param actions the set of {@link Action actions} to revoke
   */
  void revoke(EntityId entity, Principal principal, Set<Action> actions) throws Exception;

  /**
   * Revokes all {@link Principal principals'} authorization to perform any {@link Action} on the given
   * {@link EntityId}.
   *
   * @param entity the {@link EntityId} on which all {@link Action actions} are to be revoked
   */
  void revoke(EntityId entity) throws Exception;

  /**
   * Returns all the {@link Privilege} for the specified {@link Principal}.
   *
   * @param principal the {@link Principal} for which to return privileges
   * @return a {@link Set} of {@link Privilege} for the specified principal
   */
  Set<Privilege> listPrivileges(Principal principal) throws Exception;

  /********************************* Role Management: APIs for Role Based Access Control ******************************/

  /**
   * Create a role.
   *
   * @param role the {@link Role} to create
   * @throws RoleAlreadyExistsException if the role to be created already exists
   */
  void createRole(Role role) throws Exception;

  /**
   * Drop a role.
   *
   * @param role the {@link Role} to drop
   * @throws RoleNotFoundException if the role to be dropped is not found
   */
  void dropRole(Role role) throws Exception;

  /**
   * Add a role to the specified {@link Principal}.
   *
   * @param role the {@link Role} to add to the specified group
   * @param principal the {@link Principal} to add the role to
   * @throws RoleNotFoundException if the role to be added to the principals is not found
   */
  void addRoleToPrincipal(Role role, Principal principal) throws Exception;

  /**
   * Delete a role from the specified {@link Principal}.
   *
   * @param role the {@link Role} to remove from the specified group
   * @param principal the {@link Principal} to remove the role from
   * @throws RoleNotFoundException if the role to be removed from the principals is not found
   */
  void removeRoleFromPrincipal(Role role, Principal principal) throws Exception;

  /**
   * Returns a set of all {@link Role roles} for the specified {@link Principal}.
   *
   * @param principal the {@link Principal} to look up roles for
   * @return Set of {@link Role} for the specified {@link Principal}
   */
  Set<Role> listRoles(Principal principal) throws Exception;

  /**
   * Returns all available {@link Role}. Only a super user can perform this operation.
   *
   * @return a set of all available {@link Role} in the system.
   */
  Set<Role> listAllRoles() throws Exception;

  /**
   * Destroys an {@link Authorizer}. Authorization extensions can use this method to write any cleanup code.
   */
  void destroy() throws Exception;
} |
Where Principal is the entity performing actions, defined as follows:
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
public class Principal {
enum PrincipalType {
USER,
GROUP,
ROLE
}
private final String name;
private final PrincipalType type;
public Principal(String name, PrincipalType type) {
this.name = name;
this.type = type;
}
public String getName() {
return name;
}
public PrincipalType getType() {
return type;
}
} |
Integration with Apache Sentry will be achieved by implementations of these interfaces that delegate to Apache Sentry.
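Before any delegation to Sentry, the grant/revoke/enforce contract can be pictured with an in-memory stand-in. This is a hedged sketch, not the proposed implementation: `InMemoryAuthorizerSketch` is a hypothetical name, entities are plain strings, and `SecurityException` stands in for the authorization exception. Since the `Principal` class as shown does not override `equals()`/`hashCode()`, the sketch builds its map key from the principal type and name rather than using a `Principal` object as the key.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory illustration of grant/revoke/enforce; not CDAP code.
class InMemoryAuthorizerSketch {
  enum Action { READ, WRITE, EXECUTE, ADMIN }
  enum PrincipalType { USER, GROUP, ROLE }

  private final Map<String, Set<Action>> privileges = new HashMap<>();

  // Builds a map key from principal type, principal name, and entity.
  private static String key(PrincipalType type, String name, String entity) {
    return type + ":" + name + "@" + entity;
  }

  void grant(String entity, PrincipalType type, String name, Set<Action> actions) {
    privileges.computeIfAbsent(key(type, name, entity), k -> EnumSet.noneOf(Action.class)).addAll(actions);
  }

  void revoke(String entity, PrincipalType type, String name, Set<Action> actions) {
    Set<Action> current = privileges.get(key(type, name, entity));
    if (current != null) {
      current.removeAll(actions);
    }
  }

  // Throws if the action was never granted or has since been revoked.
  void enforce(String entity, PrincipalType type, String name, Action action) {
    Set<Action> current = privileges.get(key(type, name, entity));
    if (current == null || !current.contains(action)) {
      throw new SecurityException(name + " is not authorized to " + action + " " + entity);
    }
  }
}
```

A Sentry-backed implementation would keep the same method shapes but forward each call to the Sentry service instead of a local map.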
Integration with Apache Sentry
Integration with Apache Sentry involves the development of three main modules:
CDAP Sentry Binding
Here we will bind CDAP to SentryGenericServiceClient and to the operations on the client.
Code Block | ||||
---|---|---|---|---|
| ||||
public class SentryAuthorizer implements Authorizer {
  @Override
  public void grant(EntityId entity, Principal principal, Set<Action> actions) throws Exception {
    // do grant operation on sentry client with needed mapping/conversion
  }
  ...
  ...
  private SentryGenericServiceClient getClient() throws Exception {
    return SentryGenericServiceClientFactory.create(conf); // create sentry client from Configuration
  }
} |
CDAP Sentry Model
The CDAP Sentry Model defines the CDAP entities to which access must be authorized via Apache Sentry. It will be based on the Sentry Generic Authorization Model. The CDAP Sentry Model will have the following components:
CDAPAuthorizable
This interface defines the CDAP entities that need to be authorized. It must extend Authorizable.
Code Block | ||||||||
---|---|---|---|---|---|---|---|---|
| ||||||||
/**
* This interface represents an authorizable resource in the CDAP component.
*/
public interface CDAPAuthorizable extends Authorizable {
public enum AuthorizableType {
Instance,
Namespace,
Artifact,
Application,
Program,
Dataset,
Stream,
};
AuthorizableType getAuthzType();
} |
The CDAPAuthorizable interface will have to be implemented for each authorizable entity defined by the AuthorizableType enum above.
CDAPAction and CDAPActionFactory
These classes will extend BitFieldAction and BitFieldActionFactory to define the types of actions on CDAP entities. These classes also allow you to define 'implies' relationships between actions.
TODO: Think about ALL, ADMIN_ALL
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
public class CDAPActionConstants {
public static final String READ = "read";
public static final String EXECUTE = "execute";
public static final String WRITE = "write";
public static final String ADMIN = "admin"; // this is read + write + execute + admin (create/update/delete)
} |
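The 'implies' relationship between actions is naturally expressed with bit masks: one action implies another when it carries all of the other's bits. The sketch below illustrates that idea without depending on Sentry. `BitActionSketch`, the specific bit codes, and the `ADMIN_ONLY` bit are assumptions made for this example; the composition of `admin` follows the comment in `CDAPActionConstants`.

```java
// Hypothetical bit-mask illustration of 'implies'; the bit codes are assumed, not taken from CDAP or Sentry.
class BitActionSketch {
  static final int READ = 1;
  static final int WRITE = 2;
  static final int EXECUTE = 4;
  static final int ADMIN_ONLY = 8; // the create/update/delete part of admin
  static final int ADMIN = READ | WRITE | EXECUTE | ADMIN_ONLY;

  // Maps the action names used in CDAPActionConstants to their assumed bit codes.
  static int codeOf(String name) {
    switch (name) {
      case "read":
        return READ;
      case "write":
        return WRITE;
      case "execute":
        return EXECUTE;
      case "admin":
        return ADMIN;
      default:
        throw new IllegalArgumentException("Unknown action: " + name);
    }
  }

  // 'granted' implies 'requested' when every bit of 'requested' is also set in 'granted'.
  static boolean implies(String granted, String requested) {
    int grantedCode = codeOf(granted);
    int requestedCode = codeOf(requested);
    return (grantedCode & requestedCode) == requestedCode;
  }
}
```

With this encoding, a principal holding `admin` passes a check for `read`, `write`, or `execute`, while a principal holding only `read` does not pass a check for `write`.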
Sentry Policy Engine
Resource URIs
Using the above authorizable model, resource URIs for CDAP entities in the Sentry Policy Engine will be as follows:
Entity | Sentry Resource URI |
---|---|
Instance | cdap:///instance=server1 |
Namespace | cdap:///instance=server1/namespace=ns1 |
Artifact | cdap:///instance=server1/namespace=ns1/artifact=art1 |
Application | |
Program | cdap:///instance=server1/namespace=ns1/application=app1/programType=pt1/programName=prg1 |
Dataset | cdap:///instance=server1/namespace=ns1/dataset=ds1 |
Stream | cdap:///instance=server1/namespace=ns1/stream=s1 |
Note |
---|
The above URIs are internal Apache Sentry representations defined at SentryAuthorizationModelDesign. They are only mentioned here to convey how the CDAP entity hierarchy will be represented in Apache Sentry. |
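The URI layout above is an ordered list of `type=name` pairs that follows the entity hierarchy. As a minimal sketch of that layout (the class `ResourceUriSketch` and its methods are hypothetical helpers, not part of CDAP or Sentry), the pairs can be joined in insertion order:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper that renders the URI form shown in the table above.
class ResourceUriSketch {
  // Joins ordered type=name pairs into the cdap:/// form.
  static String toUri(Map<String, String> authorizables) {
    StringJoiner path = new StringJoiner("/", "cdap:///", "");
    for (Map.Entry<String, String> entry : authorizables.entrySet()) {
      path.add(entry.getKey() + "=" + entry.getValue());
    }
    return path.toString();
  }

  static String streamUri(String instance, String namespace, String stream) {
    // LinkedHashMap preserves insertion order, which keeps the hierarchy order in the URI.
    Map<String, String> parts = new LinkedHashMap<>();
    parts.put("instance", instance);
    parts.put("namespace", namespace);
    parts.put("stream", stream);
    return toUri(parts);
  }
}
```

For example, `ResourceUriSketch.streamUri("server1", "ns1", "s1")` yields `cdap:///instance=server1/namespace=ns1/stream=s1`, matching the Stream row.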
Interaction Diagram
Use-case: App Deployment by an unauthorized user
Configuration
Sentry
Property | Description | Value |
---|---|---|
sentry.service.allow.connect | List of users allowed to connect to the Sentry Server | cdap will be added to this list |
sentry.cdap.provider | Authorization provider for the CDAP component in Sentry. This class defines the user-group mapping, among other things. | org.apache.sentry.provider.common.HadoopGroupResourceAuthorizationProvider |
sentry.cdap.provider.resource | The resource for creating the Sentry Provider Backend. This property seems unused, and always defaults to "". However, all data engines (Hive, Sqoop, Kafka) define it. | "" |
sentry.cdap.provider.backend | A class that implements ProviderBackend. This class uses a SentryServiceClient to communicate with the Sentry service from the client side in Sentry. | org.apache.sentry.provider.db.generic.SentryGenericProviderBackend |
sentry.cdap.policy.engine | Defines the Sentry Policy Engine for the cdap component. Must implement org.apache.sentry.policy.common.PolicyEngine | (package name subject to change) |
sentry.cdap.instance.name | Defines the instance name for the cdap component. | cdap |
CDAP
These properties will be defined in cdap-security.xml
Property | Description | Default |
---|---|---|
security.authorization.enabled | Determines whether authorization should be enabled in CDAP. If false, a NoOpAuthorizer is used in place of the class configured in security.authorizer.class | false |
security.authorizer.class | Fully qualified class name of the authorizer class. Must implement the Authorizer interface | co.cask.cdap.security.authorization.DatasetBasedAuthorizer |
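The two properties above suggest a simple selection rule: if authorization is disabled, use a no-op authorizer; otherwise, load the configured class. The following sketch shows one way that rule could look. It is an assumption-laden illustration: `AuthorizerConfigSketch`, the one-method `Authorizer` stand-in, and `DenyAllAuthorizer` are hypothetical, and `java.util.Properties` stands in for the CDAP configuration object.

```java
import java.util.Properties;

// Hypothetical illustration of config-driven authorizer selection; not CDAP code.
class AuthorizerConfigSketch {
  // One-method stand-in for the Authorizer interface described in this document.
  interface Authorizer {
    boolean isAllowed(String principal, String entity, String action);
  }

  // Permits everything; used when authorization is switched off.
  static class NoOpAuthorizer implements Authorizer {
    public boolean isAllowed(String principal, String entity, String action) {
      return true;
    }
  }

  // Example of a configurable implementation; it denies everything.
  static class DenyAllAuthorizer implements Authorizer {
    public boolean isAllowed(String principal, String entity, String action) {
      return false;
    }
  }

  static Authorizer create(Properties conf) throws ReflectiveOperationException {
    boolean enabled = Boolean.parseBoolean(conf.getProperty("security.authorization.enabled", "false"));
    if (!enabled) {
      return new NoOpAuthorizer();
    }
    String className = conf.getProperty("security.authorizer.class");
    if (className == null) {
      throw new IllegalStateException("security.authorizer.class must be set when authorization is enabled");
    }
    // Load the configured class reflectively; it needs a no-argument constructor.
    return (Authorizer) Class.forName(className).getDeclaredConstructor().newInstance();
  }
}
```

Keeping the switch in configuration means the same deployment can run with or without authorization, which is the behavior Scenario #2 asks for.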
Role Management
To support RBAC (Role Based Access Control) backends such as Apache Sentry, we will need to support role management through CDAP.
A user using RBAC should be able to:
- Create a role
- Delete a role
- Add a role to a principal (where the principal can be of type user or group)
- Remove a role from a principal (where the principal can be of type user or group)
- List roles
- List roles for a principal
- List privileges for a role
We will need to support these operations through REST APIs and through the CLI. The proposed APIs and CLI commands are:
Operation | REST API | Body | Response | CLI Command (from Security CLI commands) |
---|---|---|---|---|
create role | PUT /security/authorization/roles/<role-name> | N/A | 200: Created the role; 409: Role already exists | create role <role-name> |
delete role | DELETE /security/authorization/roles/<role-name> | N/A | 200: Deleted the role; 404: Role not found | drop role <role-name> |
add role to principal | PUT /security/authorization/<principal-type>/<principal-name>/roles/<role-name> | | 200: Added role to principal; 404: Role not found; 404: Principal not found | add role <role-name> to group/user <group/user-name> |
remove role from principal | DELETE /security/authorization/<principal-type>/<principal-name>/roles/<role-name> | | 200: Removed role from principal; 404: Role not found; 404: Principal not found | remove role <role-name> from group/user <group/user-name> |
list roles | GET /security/authorization/roles/ | N/A | 200: List of roles | list roles |
list roles for principal | GET /security/authorization/<principal-type>/<principal-name>/roles | N/A | 200: List of roles; 404: Principal not found | list roles for group/user <group/user-name> |
list privileges for role | GET /security/authorization/roles/<role-name>/privileges | N/A | 200: List of privileges for the role; 404: Role not found | list privileges for role <role-name> |
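The response codes in the table can be illustrated with a small in-memory role store. This is a sketch under stated assumptions rather than the planned handler: `RoleStoreSketch` is a hypothetical class, principals are plain strings, the "principal not found" case is not modeled, and removing a dropped role from every principal is an assumption of this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory illustration of the role operations and their status codes.
class RoleStoreSketch {
  private final Set<String> roles = new HashSet<>();
  private final Map<String, Set<String>> rolesByPrincipal = new HashMap<>();

  // PUT /security/authorization/roles/<role-name>
  int createRole(String role) {
    return roles.add(role) ? 200 : 409; // 409 when the role already exists
  }

  // DELETE /security/authorization/roles/<role-name>
  int dropRole(String role) {
    if (!roles.remove(role)) {
      return 404;
    }
    // Assumption: dropping a role also removes it from every principal.
    rolesByPrincipal.values().forEach(assigned -> assigned.remove(role));
    return 200;
  }

  // PUT /security/authorization/<principal-type>/<principal-name>/roles/<role-name>
  int addRoleToPrincipal(String role, String principal) {
    if (!roles.contains(role)) {
      return 404;
    }
    rolesByPrincipal.computeIfAbsent(principal, k -> new HashSet<>()).add(role);
    return 200;
  }

  // GET /security/authorization/<principal-type>/<principal-name>/roles
  Set<String> listRoles(String principal) {
    return new HashSet<>(rolesByPrincipal.getOrDefault(principal, new HashSet<>()));
  }
}
```

Returning a copy from `listRoles` keeps callers from mutating the store's internal state.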
| ||
/**
* This interface represents an authorizable resource in the CDAP component.
*/
public interface CDAPAuthorizable extends Authorizable {
public enum AuthorizableType {
Instance,
Namespace,
Artifact,
Application,
Program,
Dataset,
Stream,
};
AuthorizableType getAuthzType();
} |
The CDAPAuthorizable
interface will have to be implemented for each authorizable entity defined by the AuthorizableType
enum above.
CDAPAction and CDAPActionFactory
These classes will implement BitFieldAction and BitFieldActionFactory to define the types of actions on CDAP entities. These classes also allow you to define implies relationships between actions.
TODO: Think about ALL, ADMIN_ALL
Code Block | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|
| ||||||||||
public class CDAPActionConstants {
public static final String READ = "read";
public static final String EXECUTE = "execute";
public static final String WRITE = "write";
public static final String ADMIN = "admin"; // this is read + write + execute + admin (create/update/delete)
} |
Sentry Policy Engine
Resource URIs
Using the above authorizable model, resource URIs for CDAP entities in the Sentry Policy Engine will be as follows:
Entity | Sentry Resource URI |
---|---|
Instance | cdap:///instance=server1 |
Namespace | cdap:///instance=server1/namespace=ns1 |
Artifact | cdap:///instance=server1/namespace=ns1/artifact=art/artifactVersion=1 |
Application |
|
Program | cdap:///instance=server1/namespace=ns1/application=app1/programType=pt1/programName=prg1 |
Dataset | cdap:///instance=server1/namespace=ns1/dataset=ds1 |
Stream | cdap:///instance=server1/namespace=ns1/stream=s1 |
Note |
---|
The above URIs are internal Apache Sentry representations defined at SentryAuthorizationModelDesign. They are only mentioned here to convey how the CDAP entity hierarchy will be represented in Apache Sentry. |
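The URI layout in the table can be read as an ordered list of type=name segments, from the instance down to the entity. The sketch below builds such a string from an ordered map. SentryResourceUriSketch and toUri are hypothetical names for illustration; Sentry constructs its own internal representation.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative helper that renders an entity hierarchy in the URI layout shown in the table. */
final class SentryResourceUriSketch {

  /**
   * Joins the hierarchy as type=name segments, in insertion order.
   * Callers are expected to pass an ordered map such as LinkedHashMap.
   */
  static String toUri(Map<String, String> hierarchy) {
    StringJoiner path = new StringJoiner("/", "cdap:///", "");
    for (Map.Entry<String, String> segment : hierarchy.entrySet()) {
      path.add(segment.getKey() + "=" + segment.getValue());
    }
    return path.toString();
  }

  public static void main(String[] args) {
    Map<String, String> stream = new LinkedHashMap<>();
    stream.put("instance", "server1");
    stream.put("namespace", "ns1");
    stream.put("stream", "s1");
    System.out.println(toUri(stream)); // cdap:///instance=server1/namespace=ns1/stream=s1
  }
}
```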
Interaction Diagram
Use-case: App Deployment by an unauthorized user
Configuration
Sentry
Property | Description | Value |
---|---|---|
sentry.service.allow.connect | List of users allowed to connect to the Sentry Server | cdap will be added to this list |
sentry.cdap.provider | Authorization provider for the CDAP component in Sentry. This class defines the user-group mapping, among other things. | org.apache.sentry.provider.common.HadoopGroupResourceAuthorizationProvider |
sentry.cdap.provider.resource | The resource for creating the Sentry Provider Backend. This property seems unused, and always defaults to "". However, all data engines (Hive, Sqoop, Kafka) define it. | "" |
sentry.cdap.provider.backend | A class that implements ProviderBackend. This class uses a SentryServiceClient to communicate with the Sentry service from the client side in Sentry. | org.apache.sentry.provider.db.generic.SentryGenericProviderBackend |
sentry.cdap.policy.engine | Defines the Sentry Policy Engine for the cdap component. Must implement org.apache.sentry.policy.common.PolicyEngine |
(package name subject to change) |
CDAP
These properties will be defined in cdap-security.xml
Property | Description | Default |
---|---|---|
security.authorization.enabled | Determines whether authorization should be enabled in CDAP. If false, a NoOpAuthorizer would be used for security.authorizer.class | false |
security.authorizer.class | Fully qualified class name of the authorizer class. Must implement the Authorizer interface | co.cask.cdap.security.authorization.DatasetBasedAuthorizer |
instance.name | Defines the instance name for the cdap component. | cdap |
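One way to picture how security.authorization.enabled and security.authorizer.class interact is the sketch below. It is only an illustration under assumptions: the Authorizer interface is an empty local stand-in (its real methods are not shown in this section), DemoAuthorizer and AuthorizerSelectionSketch are hypothetical names, and loading the class reflectively through a no-argument constructor is an assumed mechanism rather than the confirmed CDAP implementation.

```java
import java.util.Properties;

/** Empty stand-in for CDAP's Authorizer interface; its real methods are not shown here. */
interface Authorizer { }

/** Used when authorization is disabled. */
final class NoOpAuthorizer implements Authorizer { }

/** Hypothetical authorizer, present only to demonstrate reflective loading. */
final class DemoAuthorizer implements Authorizer { }

final class AuthorizerSelectionSketch {

  /** Picks the authorizer based on the two security.* properties. */
  static Authorizer create(Properties conf) throws ReflectiveOperationException {
    boolean enabled =
        Boolean.parseBoolean(conf.getProperty("security.authorization.enabled", "false"));
    if (!enabled) {
      // Authorization is off: ignore security.authorizer.class entirely.
      return new NoOpAuthorizer();
    }
    String className = conf.getProperty(
        "security.authorizer.class",
        "co.cask.cdap.security.authorization.DatasetBasedAuthorizer");
    // asSubclass fails fast with a ClassCastException if the class is not an Authorizer.
    return Class.forName(className)
        .asSubclass(Authorizer.class)
        .getDeclaredConstructor()
        .newInstance();
  }

  public static void main(String[] args) throws ReflectiveOperationException {
    Properties conf = new Properties();
    System.out.println(create(conf).getClass().getSimpleName()); // disabled by default

    conf.setProperty("security.authorization.enabled", "true");
    conf.setProperty("security.authorizer.class", DemoAuthorizer.class.getName());
    System.out.println(create(conf).getClass().getSimpleName());
  }
}
```

Falling back to a no-op implementation keeps call sites free of "is authorization enabled" checks, since they always talk to an Authorizer.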
Role Management
To support RBAC (Role-Based Access Control) systems such as Apache Sentry, we will need to support role management through CDAP.
A user using RBAC should be able to:
- Create a role
- Delete a role
- Add a role to a principal (where the principal can be of type user or group)
- Remove a role from a principal (where the principal can be of type user or group)
- List roles
- List roles for a principal
- List privileges for a role
We will need to support these operations through REST APIs and also through the CLI. Below are the proposed APIs and CLI commands:
ACL management
There are multiple options for ACL management. For the dataset-based authorizer, we will have to support ACL management via the CDAP CLI.
...
Although supporting the Sentry Shell seems straightforward once the CDAP backend for Sentry is implemented, it is a relatively new feature added in Sentry 1.7 (SENTRY-749). CDH 5.5 ships Sentry 1.5 and there are no timelines on support for Sentry 1.7 (Cloudera Maven Repository).
After some digging, we found that SentryShell is hardcoded to work only with Hive. At the time of writing, Kafka is adding support for SentryShell by making a copy of Hive's SentryShell. This seems to be the norm in Sentry for shell support, since there is no generic shell that services integrating with Sentry can use. Unless we have a strong reason, we should avoid supporting CDAP through SentryShell, especially since we are already working on supporting ACL management for CDAP in Sentry through Hue. See below.
For recognizing and listing CDAP entities in Hue, we will have to implement a CDAP Webapp for Hue. Hue is implemented entirely in Python using the Django framework. This integration is a risk for 3.4. More details on this TBD.
Testing
For testing the Sentry integration, there are a couple of approaches. We can use the file-based policy store in Apache Sentry for tests. However, to simulate more realistic scenarios, we should explore whether it is easy to set up an in-memory database (HSQL, etc.) with the Sentry schema in tests.
...