...
- As a CDAP security admin, I want all operations on datasets/streams to be governed by my configured authorization system.
- As a CDAP security admin, I want list operations for all CDAP entities to only return entities that the logged-in user is authorized to view.
- As a CDAP security admin, I want view operations for a CDAP entity to only succeed if the logged-in user is authorized to view that entity.
...
Because CDAP sub-components will now rely solely on the authorization policy cache for ACL checks, repeated cache refresh failures (for example, because the authorization backend is down) become a problem. If refreshes fail consistently over a period of time, the cache stays stale for that entire period, and CDAP keeps enforcing privileges that may since have been revoked. This is a serious security loophole, so there should be a way to invalidate the cache when such failures persist. One option is a configurable retry limit for refresh failures. When the limit is reached, the cache is cleared, and until the next successful refresh every operation in CDAP results in an authorization failure. Although this leaves CDAP unusable, it reduces the chance of a security breach. In that case, admins must fix the communication between CDAP and the authorization backend before CDAP can be used again.
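The retry-limit behavior can be illustrated with a minimal sketch. The class name FailClosedPolicyCache, the fetcher callback, and the "principal|entity|action" string encoding of a privilege are all hypothetical names chosen for illustration; they are not CDAP's actual classes or data model.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

// Hypothetical sketch of a policy cache that fails closed after repeated refresh failures.
class FailClosedPolicyCache {
  // Stand-in for the call that fetches all privileges from the authorization backend.
  private final Callable<Set<String>> fetcher;
  // The configurable retry limit described above.
  private final int maxConsecutiveFailures;
  private volatile Set<String> privileges = Collections.emptySet();
  private int consecutiveFailures = 0;

  FailClosedPolicyCache(Callable<Set<String>> fetcher, int maxConsecutiveFailures) {
    this.fetcher = fetcher;
    this.maxConsecutiveFailures = maxConsecutiveFailures;
  }

  // Intended to be called periodically by the asynchronous refresh thread.
  synchronized void refresh() {
    try {
      privileges = Collections.unmodifiableSet(new HashSet<String>(fetcher.call()));
      consecutiveFailures = 0; // a successful refresh resets the failure count
    } catch (Exception e) {
      consecutiveFailures++;
      if (consecutiveFailures >= maxConsecutiveFailures) {
        // Limit reached: clear the cache so every check fails until the next success.
        privileges = Collections.emptySet();
      }
    }
  }

  // Sub-components consult only the cache; an empty cache denies everything.
  boolean isAuthorized(String principal, String entity, String action) {
    return privileges.contains(principal + "|" + entity + "|" + action);
  }
}
```

With a limit of 3, the first two failed refreshes keep serving the stale privileges; the third clears them, and access is restored only after a refresh succeeds.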
Alternative Caching Approach
In an alternative caching approach, the CDAP sub-components query the cache for a privilege; the cache returns the result on a hit and goes back to the authorization provider on a miss.
Pros:
- Cache expiry can be set per privilege, making the refresh process more streamlined
- No need for an asynchronous cache refresh thread that refreshes all policies (which is asynchronous, but makes each refresh take longer)
Cons:
- The major drawback of this approach is that it can make the majority access pattern slow, because it requires a call to the authorization provider every time a privilege is not found in the cache.
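The alternative approach amounts to a read-through cache with per-entry expiry, which can be sketched roughly as follows. The class name ReadThroughPrivilegeCache, the provider predicate, the injected clock, and the string encoding of a privilege are assumptions made for illustration. The sketch also caches negative answers, which is a design choice the text above does not specify.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

// Hypothetical sketch of a read-through privilege cache with per-entry expiry.
class ReadThroughPrivilegeCache {
  private static final class Entry {
    final boolean allowed;
    final long expiresAtMillis;

    Entry(boolean allowed, long expiresAtMillis) {
      this.allowed = allowed;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
  // Stand-in for the (potentially slow) call to the authorization provider.
  private final Predicate<String> provider;
  private final long ttlMillis;
  // Injected clock so that expiry can be exercised without sleeping.
  private final LongSupplier clock;

  ReadThroughPrivilegeCache(Predicate<String> provider, long ttlMillis, LongSupplier clock) {
    this.provider = provider;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  boolean isAuthorized(String privilege) {
    long now = clock.getAsLong();
    Entry entry = entries.get(privilege);
    if (entry != null && now < entry.expiresAtMillis) {
      return entry.allowed; // hit: answered from the cache
    }
    // Miss or expired entry: the caller pays for a round trip to the provider.
    boolean allowed = provider.test(privilege);
    entries.put(privilege, new Entry(allowed, now + ttlMillis));
    return allowed;
  }
}
```

Each entry expires on its own schedule and no refresh thread is needed, but every miss blocks the caller on the provider, which is the drawback listed under Cons.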
Caching in Apache Sentry
Client-side caching is under active development in Apache Sentry as part of SENTRY-1229. It will likely suffer from the same cache-freshness drawbacks mentioned above. There is a case for re-using this (and other such) caching from authorization providers in CDAP. However, we will choose to implement a cache in CDAP independently for the following reasons:
...