The goal of this page is to document the design of the Tracker Audit Metrics.
Use-Cases
- As a user of Tracker, I would like to see the total number of audit messages by type/subtype in the past T timeframe.
- "Show me the total number of reads in the system in the past 1 hour."
- As a user of Tracker, I would like to see the top N datasets/streams by audit message type/subtype activity in the past T timeframe.
- "Show me the 5 datasets with the most writes in the past 24 hours."
- "Show me the 5 streams with the most metadata_changes in the past 7 days."
- As a user of Tracker, I would like to see the top N namespaces with the most type/subtype activity in the past T timeframe.
- "Show me the 5 namespaces with the most reads in the past 1 hour."
Initial High Level Plan
- As messages arrive from the Kafka broker and are written to the AuditLog table, update the metrics for every message that matches one of the metric criteria. The metrics live in a separate OLAP cube within the same dataset.
- In the service layer, expose a new endpoint that allows users to query the data in the metrics table and returns the results in JSON.
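
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Tracker implementation: the in-memory `Counter` stands in for the cube, the message layout (`time`, `namespace`, `entity_type`, `entity_name`, `type`, `subtype`) is assumed, and the names `record` and `top_entities` are hypothetical.

```python
import json
from collections import Counter

HOUR = 3600  # seconds; the finest resolution planned for the cube

# Hypothetical in-memory stand-in for the metrics cube:
# (hour bucket, namespace, entity type, entity name, type, subtype) -> count
metrics = Counter()

def record(message):
    """Ingest path: bump the counter for one audit message (assumed dict layout)."""
    bucket = message["time"] - message["time"] % HOUR
    key = (bucket, message["namespace"], message["entity_type"],
           message["entity_name"], message["type"], message.get("subtype"))
    metrics[key] += 1

def top_entities(entity_type, type_, subtype, start, end, limit):
    """Query path: JSON list of the `limit` busiest entities with buckets in [start, end)."""
    totals = Counter()
    for (bucket, ns, etype, name, t, st), count in metrics.items():
        if etype == entity_type and t == type_ and st == subtype and start <= bucket < end:
            totals[(ns, name)] += count
    # Highest count first; ties broken by (namespace, name) for a stable order.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
    return json.dumps([{"namespace": ns, "name": name, "value": value}
                       for (ns, name), value in ranked])
```

With this sketch, "Show me the 5 datasets with the most writes in the past 24 hours" becomes `top_entities("dataset", "access", "write", now - 24 * HOUR, now, 5)`, where `now` is the current epoch time in seconds.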
Storing Metrics in AuditLog Dataset
- Add an additional Cube table to the AuditLog custom dataset to hold metrics.
- The properties of the cube will be as follows:
  - Resolutions: 1h, 1d, 7d, 30d
  - Aggregations:
    - access_type (access, create, update, truncate, delete, metadata_change)
    - access_type,subtype (access,read; access,write; access,unknown)
    - namespace (default, ns1, ns2)
    - namespace,entity_type,entity_name (default,stream,stream1; default,dataset,dataset1)
  - Measurements:
    - accesses
    - access_reads
    - access_writes
    - access_unknowns
    - creates
    - updates
    - truncates
    - deletes
    - metadata_changes
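
To make the cube properties concrete, the sketch below writes them down as plain Python data and shows how a single audit message might map onto measurements and time buckets. Several points are assumptions rather than part of the design: that an access message increments both `accesses` and its subtype measurement, that a missing or unrecognized subtype counts as `access_unknowns`, and that buckets are aligned to multiples of the resolution since the Unix epoch. The helper names `measurements_for` and `bucket_starts` are hypothetical.

```python
# Resolutions from the list above, in seconds.
RESOLUTIONS = {"1h": 3600, "1d": 86400, "7d": 7 * 86400, "30d": 30 * 86400}

# Aggregations from the list above, each a tuple of dimension names.
AGGREGATIONS = [
    ("access_type",),
    ("access_type", "subtype"),
    ("namespace",),
    ("namespace", "entity_type", "entity_name"),
]

# Assumed naming rule: one measurement per type, plus one per access subtype.
TYPE_MEASUREMENTS = {
    "access": "accesses",
    "create": "creates",
    "update": "updates",
    "truncate": "truncates",
    "delete": "deletes",
    "metadata_change": "metadata_changes",
}
SUBTYPE_MEASUREMENTS = {
    "read": "access_reads",
    "write": "access_writes",
    "unknown": "access_unknowns",
}

def measurements_for(access_type, subtype=None):
    """Names of the measurements one audit message would increment (assumed rule)."""
    names = [TYPE_MEASUREMENTS[access_type]]
    if access_type == "access":
        # Assumption: a missing or unrecognized subtype is counted as unknown.
        names.append(SUBTYPE_MEASUREMENTS.get(subtype, "access_unknowns"))
    return names

def bucket_starts(timestamp):
    """Start of the epoch-aligned bucket containing `timestamp` at each resolution."""
    return {name: timestamp - timestamp % seconds
            for name, seconds in RESOLUTIONS.items()}
```

Under these assumptions, a read of `dataset1` would increment `accesses` and `access_reads` once in each of the four resolution buckets, for every aggregation whose dimensions the message carries.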