Hue Integration

 

 

Goals

  1. Explore CDAP Entities in Hue

  2. Use Hue's admin interface to manage the CDAP ACLs stored in Apache Sentry

Checklist

  • User stories documented (Shenggu)
  • User stories reviewed (Nitin)
  • Design documented (Shenggu)
  • Design reviewed (Andreas)
  • Feature merged (Shenggu)
  • Integration tests (Shenggu)
  • Documentation for feature (Shenggu)
  • Blog post (Shenggu)

User Stories

  • As a Hue admin, I should be able to easily configure CDAP as a plugin app in the Hue system
  • As a CDAP user or a CDAP admin, I should be able to explore the entities of CDAP (Namespace->Application->Program(->subprogram), Namespace->Stream/Dataset/Artifacts) in Cloudera Hue's UI.
  • As a CDAP user, I should be able to perform all the ACL management operations provided by Apache Sentry through Cloudera Hue's admin UI.
    • CDAP superusers can manage all the rules
    • A user or group that has ADMIN on an entity can grant ACLs on that entity to other users or groups
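
The two rules above can be sketched as a small permission check. This is only an illustration: the function name, the `acls` dictionary layout, and the entity path strings are hypothetical and are not part of Hue, CDAP, or Sentry.

```python
def can_manage_acl(user, groups, entity, acls, superusers):
    """Return True if `user` may grant or revoke ACLs on `entity`.

    Hypothetical model: `acls` maps an entity path to a dict of
    {principal name: set of operations}.
    """
    # Rule 1: CDAP superusers can manage all rules.
    if user in superusers:
        return True
    # Rule 2: the user, or any group the user belongs to, holds ADMIN on the entity.
    principals = {user} | set(groups)
    entity_acls = acls.get(entity, {})
    return any("ADMIN" in entity_acls.get(p, set()) for p in principals)
```

For example, with `acls = {"ns1/app1": {"alice": {"ADMIN"}}}`, the check passes for `alice` on `ns1/app1` and fails for a user who only holds READ.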

Scenarios

Scenario 1

A user (typically a CDH user) uses Hue to explore and manage ACLs and other operations for all the different services on their cluster. They would prefer to use Hue's consistent UI to manage ACLs for CDAP from a central place rather than separately in the CDAP UI.

Design

The integration code will be part of Cloudera Hue and will communicate with CDAP and Apache Sentry through REST and Thrift to manage the ACLs. The Hue app itself does not store any state during this process.

 

Brief Introduction of Cloudera Hue

 

 (from Hue's documentation: http://www.cloudera.com/documentation/archive/cdh/4-x/4-2-0/Hue-2-User-Guide/hue2.html)

 

"Hue is a set of web applications that enable you to interact with a CDH cluster. Hue applications let you browse HDFS and work with Hive and Cloudera Impala queries, MapReduce jobs, and Oozie workflows."

The Hue server is written in Python using the Django framework, and different systems, such as HBase or Impala, are configured as separate Django apps. Users control these components on the cluster through the web interface. It is also possible to add custom apps to the Hue server to support additional systems.

Logic view of the system

There are two possible designs for the system. 

Design 1:

 

Design 2:

 

As shown in both diagrams above, CDAP and Sentry support is packaged as a plugin app installed in Hue. Hue's front end is implemented in Django, which provides good isolation and extensibility for multiple apps running together in one web service. A separate panel will be created in Hue's default UI for the related operations. The app communicates with CDAP through CDAP's RESTful API, and all live entities are displayed in Hue's UI.
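
As a rough illustration of the REST communication described above, the sketch below builds the URLs the app could call to list entities. The base address is a placeholder, the helper names are hypothetical, and the paths follow the CDAP v3 REST API as commonly documented; they should be checked against the CDAP version in use.

```python
from urllib.parse import quote

# Assumption: placeholder router address; replace with the real CDAP endpoint.
CDAP_BASE = "http://cdap-router.example.com:11015/v3"

# Collections listed under each namespace (assumed CDAP v3 REST paths).
NAMESPACE_CHILDREN = {
    "application": "apps",
    "stream": "streams",
    "dataset": "data/datasets",
    "artifact": "artifacts",
}


def namespaces_url(base=CDAP_BASE):
    """URL that lists all namespaces."""
    return "{}/namespaces".format(base)


def children_url(namespace, kind, base=CDAP_BASE):
    """URL that lists entities of one kind inside a namespace."""
    return "{}/namespaces/{}/{}".format(
        base, quote(namespace, safe=""), NAMESPACE_CHILDREN[kind]
    )
```

The app would issue a GET request against each URL and render the returned JSON; the HTTP call itself is left out to keep the sketch self-contained.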

 

Communication with Apache Sentry goes through Sentry's Thrift service. When an admin grants or revokes privileges through the Hue UI, the change is propagated to Sentry and takes effect on subsequent requests coming from CDAP. In Design 1, Hue talks to Sentry directly, while Design 2 uses the Sentry client APIs built into CDAP. Although Design 2 requires less code, we will implement Design 1 because it is consistent with how other Hue plugins (Hive, HDFS) behave and it covers more cases (a security breach, for instance). Under Design 1, Hue itself talks to Sentry and therefore needs its own keytab file to authenticate with Kerberos.

UI Mockup

One possible UI layout is shown below. All CDAP entities are listed hierarchically on the left. When the user clicks a specific entity, they can view its detailed properties and manage the ACL rules associated with it. The actual UI may vary in colors and relative layout of elements, but it will stick to this concept.
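
The hierarchical listing can be backed by a simple nested structure. The sketch below is only an illustration: it assumes entities arrive as slash-separated paths such as `namespace/application/program`, and the function name is hypothetical.

```python
def build_entity_tree(paths):
    """Nest flat 'namespace/application/program' paths into a dict tree."""
    tree = {}
    for path in paths:
        node = tree
        # Walk down the tree, creating missing levels along the way.
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})
    return tree
```

Each dictionary key becomes one node in the left-hand panel, and an empty dictionary marks a leaf entity.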

Here are some other possible UI designs. The underlying idea is the same: present the entity hierarchy to the user, with either a separate panel or a pop-up window for managing ACLs.

We can present ACL addition in a pop-up window to keep the user's focus on it.

 

In this case, all the ACL management buttons are presented in the pop-up window. Entity descriptions can be displayed to the right of the entity name or shown when the mouse hovers over it.

 

Among these UI layouts, we prefer to implement the first one, since displaying all the UI components on the same page involves less window open/close logic and is less confusing to end users. In addition, the description of each entity is generally short (fewer than 5 entries in the first layer), so the ACL-adding panel can be placed right under the description.

Configuration

To configure the CDAP app in Hue, copy the CDAP app source code into $HUE_ROOT and run the command below:

$HUE_ROOT/tools/app_reg/app_reg.py --install cdap --relative-paths

The setup script will automatically add all required fields to Hue's configuration file.

 

Note: As the project moves on, some custom settings may move into Hue's configuration file (located in $HUE_ROOT/desktop/config.dist/hue.ini), e.g. the root host address of CDAP's REST API.

Currently no specific configuration is required on the CDAP side.
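
If settings such as the CDAP REST address do move into hue.ini, reading them could look roughly like the sketch below. The `[cdap]` section and its key names are hypothetical, and Hue has its own configuration layer; the standard-library `configparser` is used here only to keep the example self-contained.

```python
import configparser

# Hypothetical hue.ini fragment; section and key names are illustrative only.
SAMPLE_INI = """
[cdap]
rest_api_host = cdap-router.example.com
rest_api_port = 11015
"""


def load_cdap_settings(ini_text):
    """Parse the hypothetical [cdap] section into a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["cdap"]
    return {
        "host": section.get("rest_api_host", fallback="localhost"),
        "port": section.getint("rest_api_port", fallback=11015),
    }
```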

Routes

This section explains the routes defined in Hue's CDAP app. In Django (which Hue is built on), routes are defined in urls.py, which uses regular expressions to specify the URL formats. Mako is used as the HTML template engine.

URL                                                              Response
GET /cdap/                                                       index.mako (main page)
GET /cdap/details/path/to/entity/entity_id                       JSON of entity properties
GET /cdap/acl/path/to/entity/entity_id                           JSON of entity ACLs
POST /cdap/acl/add/entity_id/ --data {groupid, operations}       200 ok / 500 error
POST /cdap/acl/revoke/entity_id/ --data {groupid, operations}    200 ok / 500 error
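
The routes above could be expressed as regular expressions along the following lines. This is a sketch using plain `re` rather than an actual urls.py; the pattern details, view names, and the `resolve` helper are assumptions for illustration, and the HTTP method is not checked.

```python
import re

# The more specific add/revoke patterns come before the generic ACL listing
# pattern, because the first match wins.
ROUTES = [
    (re.compile(r"^/cdap/$"), "index"),
    (re.compile(r"^/cdap/details/(?P<path>.+)$"), "details"),
    (re.compile(r"^/cdap/acl/add/(?P<entity_id>[^/]+)/$"), "acl_add"),
    (re.compile(r"^/cdap/acl/revoke/(?P<entity_id>[^/]+)/$"), "acl_revoke"),
    (re.compile(r"^/cdap/acl/(?P<path>.+)$"), "acl_list"),
]


def resolve(url):
    """Return (view name, captured arguments) for the first matching route."""
    for pattern, name in ROUTES:
        match = pattern.match(url)
        if match:
            return name, match.groupdict()
    return None
```

In a real Django app the same patterns would be registered in urls.py and mapped to view functions instead of names.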
  
  
 

 

The operations here include {READ | WRITE | EXECUTE | ADMIN | ALL}. Multiple operations can be granted/revoked at once.
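
A request handler would need to validate the operations it receives before forwarding them. The sketch below assumes the `operations` field arrives as a comma-separated string, which the design does not specify; the function name is hypothetical.

```python
VALID_OPERATIONS = {"READ", "WRITE", "EXECUTE", "ADMIN", "ALL"}


def parse_operations(raw):
    """Split a comma-separated operations string and reject unknown names."""
    ops = {op.strip().upper() for op in raw.split(",") if op.strip()}
    if not ops:
        raise ValueError("no operations given")
    unknown = ops - VALID_OPERATIONS
    if unknown:
        raise ValueError("unknown operations: " + ", ".join(sorted(unknown)))
    return ops
```

Returning a set makes it straightforward to grant or revoke several operations in one request.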

 

Out of Scope

The design above only supports listing CDAP entities and performing ACL management on them; it does not provide full support for managing the entities themselves. The unsupported cases are listed below and might be supported in the future.

  • Deploying/starting/destroying a program
  • Creating/deleting/renaming an entity
  • Listing and exploring entities that are not related to ACL management, such as services and workflows
  • Changing the properties of entities

 


 
