Objective
Publish all changes made to entities so that other apps and tools, such as Cask Tracker and MDM, can use the feed as a source of audit information.
Use Cases
Use cases and user stories are documented at Cask Tracker (formerly Cask Finder).
Design Choices
We chose Kafka as the system to which CDAP publishes audit information. Other tools can subscribe to the Kafka feed to receive it.
However, publishing to Kafka currently has drawbacks that will need to be addressed later:
- Publishing to Kafka does not happen inside a transaction, so the audit feed in Kafka may be inconsistent with what actually happened. CDAP-5109 has more discussion on this.
- There is no access control on who can publish audit information to Kafka (CDAP-5130).
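The first drawback can be illustrated with a toy model. The sketch below is not CDAP code: `FlakyPublisher`, `apply_change`, and the string-valued changes are hypothetical stand-ins, and the failure mode (a publish error after the change is already committed) is just one assumed way the feed could drift from reality.

```python
class FlakyPublisher:
    """Hypothetical stand-in for a Kafka producer whose send may fail."""

    def __init__(self, fail_on):
        self.fail_on = set(fail_on)
        self.feed = []  # what subscribers would eventually see

    def publish(self, message):
        if message in self.fail_on:
            raise ConnectionError("broker unavailable")
        self.feed.append(message)


def apply_change(store, publisher, change):
    # Step 1: the change is committed to the store.
    store.append(change)
    # Step 2: the audit message is published separately, outside the commit.
    try:
        publisher.publish(change)
    except ConnectionError:
        # The commit is not rolled back, so the feed misses this change.
        pass


store = []
publisher = FlakyPublisher(fail_on={"DELETE ds1"})
for change in ["CREATE ds1", "UPDATE ds1", "DELETE ds1"]:
    apply_change(store, publisher, change)

# Changes that happened but never reached the audit feed.
missing = [c for c in store if c not in publisher.feed]
print(missing)
```

Because the two steps are not atomic, any consumer that treats the feed as the complete history would never learn about the change in `missing`.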
Audit Message Format
Types of Audit Message
The following types of audit messages are published for an entity:
- CREATE
- UPDATE
- TRUNCATE
- DELETE
- ACCESS_READ
- ACCESS_WRITE
- ACCESS_READ_WRITE
- METADATA_CHANGE
[
  {
    "time": 1456956659469,
    "entityId": {
      "namespace": "ns1",
      "dataset": "ds1",
      "entity": "DATASET"
    },
    "user": "cdap",
    "type": "METADATA_CHANGE",
    "change": {
      "additions": [
        {
          "scope": "USER",
          "properties": {
            "key1": "value1"
          },
          "tags": [
            "tag1"
          ]
        }
      ],
      "deletions": [
        {
          "scope": "SYSTEM",
          "properties": {},
          "tags": [
            "tag2"
          ]
        }
      ]
    }
  },
  {
    "time": 1456956659470,
    "entityId": {
      "namespace": "ns1",
      "dataset": "ds1",
      "entity": "DATASET"
    },
    "user": "cdap",
    "type": "CREATE"
  },
  {
    "time": 1456956659471,
    "entityId": {
      "namespace": "ns1",
      "dataset": "ds1",
      "entity": "DATASET"
    },
    "user": "cdap",
    "type": "ACCESS",
    "access": {
      "type": "READ",
      "entityId": {
        "namespace": "ns1",
        "application": "app1",
        "type": "Flow",
        "program": "flow1",
        "entity": "PROGRAM"
      }
    }
  }
]
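As a rough illustration of how a subscriber might read messages in the format above, here is a minimal Python sketch using only the standard library. The payload is two of the example messages; `describe_entity`, `summarize`, and the `KIND:part.part` rendering are hypothetical helpers introduced for this sketch, not part of CDAP.

```python
import json
from collections import Counter

# Two messages taken from the example above.
SAMPLE_FEED = """
[
  {"time": 1456956659470,
   "entityId": {"namespace": "ns1", "dataset": "ds1", "entity": "DATASET"},
   "user": "cdap",
   "type": "CREATE"},
  {"time": 1456956659471,
   "entityId": {"namespace": "ns1", "dataset": "ds1", "entity": "DATASET"},
   "user": "cdap",
   "type": "ACCESS",
   "access": {"type": "READ",
              "entityId": {"namespace": "ns1", "application": "app1",
                           "type": "Flow", "program": "flow1",
                           "entity": "PROGRAM"}}}
]
"""


def describe_entity(entity_id):
    """Render an entityId object as 'KIND:part1.part2...' (hypothetical format)."""
    kind = entity_id["entity"]
    # Every key other than 'entity' identifies the entity; keep source order.
    parts = [str(v) for k, v in entity_id.items() if k != "entity"]
    return f"{kind}:{'.'.join(parts)}"


def summarize(messages):
    """Count messages by type and build one readable line per message."""
    counts = Counter(m["type"] for m in messages)
    lines = []
    for m in sorted(messages, key=lambda m: m["time"]):
        line = f'{m["time"]} {m["user"]} {m["type"]} {describe_entity(m["entityId"])}'
        access = m.get("access")  # only present on access messages
        if access:
            line += f' {access["type"]} by {describe_entity(access["entityId"])}'
        lines.append(line)
    return counts, lines


messages = json.loads(SAMPLE_FEED)
counts, lines = summarize(messages)
for line in lines:
    print(line)
```

The optional `access` and `change` objects are read with `dict.get`, since only some message types carry them.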
Implementation