Tracker TruthMeter Function Design

Truth Score depends on three metrics:
1) Percentage of Audit Log Messages (40% of the score)

This is (Audit Log Messages for a Dataset / Total Audit Log Messages). Program reads are omitted to avoid redundancy.

2) Percentage of Unique Programs Read (40% of the score)

This is (Total Unique Programs Reading from the Dataset / Total Programs Present).

3) Time Since Last Read (20% of the score, based on how the dataset's most recent read ranks against the most recent reads of all datasets)

If there are 10 datasets, they are sorted by the time since each was last read. The most recently read dataset is ranked first and gets 10/10 * 20 = 20 points. The second most recently read dataset gets 9/10 * 20 = 18 points, the third gets 8/10 * 20 = 16 points, and so on. Because the time since the last read can range from 0 to a very large number, or be undefined if the dataset was never read, a relative score seems necessary.
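The ranking step can be sketched as follows. This is a minimal illustration, not the Tracker implementation: the function name `recency_points` and its input shape (a mapping from dataset name to seconds since the last read) are hypothetical. The design does not say how never-read datasets or ties are handled, so the sketch assumes a never-read dataset (`None`) ranks last and that ties keep their input order.

```python
import math


def recency_points(seconds_since_last_read, weight=20):
    """Return rank-based points per dataset; the most recent read earns full weight."""
    n = len(seconds_since_last_read)

    def sort_key(name):
        seconds = seconds_since_last_read[name]
        # Assumption: a dataset that was never read (None) sorts after all others.
        return math.inf if seconds is None else seconds

    # Most recently read first; sorted() is stable, so ties keep input order.
    ordered = sorted(seconds_since_last_read, key=sort_key)
    # Position 0 earns n/n of the weight, position 1 earns (n-1)/n, and so on.
    return {name: (n - position) * weight / n for position, name in enumerate(ordered)}
```

For example, `recency_points({"DS1": 80, "DS2": 1000})` gives DS1 20 points and DS2 10 points.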

Sample Output 1

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	70	60	80s	72
DS2	30	50	1000s	42

Calculation Example:

DS1: 70% of the Audit Log Messages are for DS1. 70*40/100 = 28 (40% of the score)

60% of the programs access DS1: 60 * 40/100 = 24 (40% of the score)

Among the two datasets, DS1 has been accessed the most recently, so 2/2 * 20 = 20 (20% of the score)

Total: 72

DS2: 30 * 0.4 + 50 * 0.4 + 1/2 * 20 = 42
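The same arithmetic, written as a small function for checking table rows. This is only a sketch: `truth_score` and its parameter names are hypothetical, and the recency rank (1 = most recently read) is assumed to be computed beforehand.

```python
def truth_score(audit_pct, program_pct, recency_rank, dataset_count):
    """Combine the three metrics into a score out of 100."""
    # 40% of the score: share of audit log messages for this dataset.
    audit_points = audit_pct * 40 / 100
    # 40% of the score: share of programs that read this dataset.
    program_points = program_pct * 40 / 100
    # 20% of the score: rank 1 earns N/N * 20, rank 2 earns (N-1)/N * 20, and so on.
    recency_points = (dataset_count - recency_rank + 1) * 20 / dataset_count
    return audit_points + program_points + recency_points


print(truth_score(70, 60, recency_rank=1, dataset_count=2))  # DS1
print(truth_score(30, 50, recency_rank=2, dataset_count=2))  # DS2
```

The two calls reproduce the Sample Output 1 scores of 72 and 42.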

Sample Output 2

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	25	30	10000s	27
DS2	25	30	9000s	32
DS3	25	30	800s	37
DS4	25	30	70s	42

Calculation Example:

DS1:

25 * 0.4 + 30 * 0.4 + 1/4 * 20 = 27

Sample Output 3

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	65	80	10s	78
DS2	25	40	20s	41
DS3	10	40	40s	25
DS4	5	10	30s	16

Problems with the design

Scores go down (on average) as the number of tracked datasets increases (example: Sample Output 2).
Most scores are on the lower end. Even datasets that look popular on paper have a score of only around 65-80.
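
The first problem can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the design: it assumes every audit log message belongs to exactly one tracked dataset (so the audit log percentages sum to 100) and that recency ranks run from 1 to N without ties. The function name `expected_average_score` is hypothetical.

```python
def expected_average_score(dataset_count, average_program_pct):
    """Average Truth Score across all datasets, under the assumptions above."""
    # Audit log shares sum to 100, so the average share is 100 / N,
    # worth 40 / N points.
    audit_points = 40 / dataset_count
    # The program metric does not depend on the number of datasets.
    program_points = average_program_pct * 40 / 100
    # Ranks 1..N earn N/N, (N-1)/N, ..., 1/N of 20 points;
    # the mean of those is 10 * (N + 1) / N.
    recency_points = 10 * (dataset_count + 1) / dataset_count
    return audit_points + program_points + recency_points


print(expected_average_score(2, 55))  # Sample Output 1
print(expected_average_score(4, 30))  # Sample Output 2
```

The two calls give 57 and 34.5, which match the mean scores in Sample Output 1 and Sample Output 2. Under these assumptions, both the audit log term and the recency term shrink as N grows, so the average score falls toward 0.4 * (average % of unique programs) + 10.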