Page Comparison

The Truth Score depends on three metrics:
1) Dataset Activity

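Let N be the total number of datasets tracked, M the total number of audit log messages, and m1 the number of audit log messages for dataset 1.

Then the Truth Value for component 1 alone is:

log₁₀(m1/M - 100/N + 50) / log₁₀(150)

Here m1/M is expressed as a percentage. This reading is consistent with the sample scores: a 25% share with N = 4 gives log₁₀(50) / log₁₀(150) ≈ 0.78, reported as 78.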
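As a minimal sketch of the formula above, the Python function below computes the component-1 value. The function name is hypothetical, and passing m1/M as a percentage is an assumption; it is the reading that reproduces sample scores such as 78 for a 25% share across 4 datasets.

```python
import math


def dataset_activity_score(share_pct: float, n_datasets: int) -> float:
    """Log-scaled value for the dataset-activity component alone.

    share_pct is m1/M expressed as a percentage (an assumption, see above).
    """
    inner = share_pct - 100.0 / n_datasets + 50.0
    if inner <= 0:
        # Reached when N = 2 and the dataset has a 0% share.
        raise ValueError("logarithm is undefined for a non-positive argument")
    return math.log10(inner) / math.log10(150.0)


score = dataset_activity_score(25, 4)
print(f"{score:.2f}")
```

The argument of the logarithm stays below 150 because the share is at most 100% and 100/N is positive, which is presumably why 150 is used as the normalizing base. How a 0% share with N = 2 should be handled is not specified in this document; the sketch simply raises an error.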

Percentage of Audit Log Messages (40% of the score)

This is (Audit Log Messages for a Dataset / Total Audit Log Messages). Program reads are omitted to avoid redundancy.

2) Percentage of Unique Programs Read (40% of the score)

This is (Total Unique Programs reading from dataset / Total programs present)

3) Time Since Last Read (20% of the score, by rank of the most recent read among most recent reads for all datasets)

Example: if there are 10 datasets, they are sorted by the time since each was last read. The most recently read dataset gets rank 10 and scores 10/10 * 20 = 20 points (rank / total datasets * 20). The second most recently read dataset scores 9/10 * 20 = 18, the third scores 8/10 * 20 = 16, and so on. Because the time since the last read can range from 0 to a very large number, or be undefined if the dataset was never read, a relative score seems necessary.
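The ranking described above could be sketched in Python as follows. The function name, the input shape (dataset name mapped to seconds since the last read), and the tie-breaking behavior are all assumptions made for illustration.

```python
from typing import Dict


def recency_points(seconds_since_read: Dict[str, float],
                   weight: float = 20.0) -> Dict[str, float]:
    """Return rank / total datasets * weight for each dataset."""
    n = len(seconds_since_read)
    # Order from least recently read (largest elapsed time) to most recent,
    # so the most recently read dataset ends up with rank n.
    ordered = sorted(seconds_since_read, key=seconds_since_read.get, reverse=True)
    return {name: (position + 1) / n * weight
            for position, name in enumerate(ordered)}


points = recency_points({"DS1": 80, "DS2": 1000})
print(points)
```

A dataset that was never read could be given `float("inf")` so that it sorts as the least recent. Ties are broken by input order in this sketch, which the design does not specify.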

Sample Output 1

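Dataset Activity alone (logarithmic formula):

Dataset	% of Audit Log Messages	Score
DS1	70	85%
DS2	30	68%

Dataset	% of Audit Log Messages	Score
DS1	25	78
DS2	25	78
DS3	25	78
DS4	25	78

All three metrics combined:

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	70	60	80s	72
DS2	30	50	1000s	42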

Calculation Example:

DS1: 70% of the Audit Log Messages are for DS1. 70*40/100 = 28 (40% of the score)

60% of the programs access DS1: 60 * 40/100 = 24 (40% of the score)

Among the two datasets, DS1 has been accessed the most recently, so 2/2 * 20 = 20 (20% of the score)

Total: 72

DS2: 30 * .4 + 50 * .4 + 1/2 * 20 = 42
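The calculation above can be reproduced with a short Python sketch. The function name and the tuple layout are hypothetical; the weights (40, 40, 20) and the input values for DS1 and DS2 are taken from this sample.

```python
from typing import Dict, Tuple

# name -> (% of audit log messages, % of unique programs, seconds since last read)
Metrics = Tuple[float, float, float]


def truth_scores(datasets: Dict[str, Metrics]) -> Dict[str, float]:
    n = len(datasets)
    # Rank 1 is the least recently read dataset, rank n the most recent.
    by_staleness = sorted(datasets, key=lambda name: datasets[name][2], reverse=True)
    rank = {name: i + 1 for i, name in enumerate(by_staleness)}
    return {
        name: audit_pct * 40 / 100 + programs_pct * 40 / 100 + rank[name] / n * 20
        for name, (audit_pct, programs_pct, _) in datasets.items()
    }


scores = truth_scores({"DS1": (70, 60, 80), "DS2": (30, 50, 1000)})
print(scores)
```

For these inputs the sketch yields 72 for DS1 and 42 for DS2, matching the totals worked out above.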

Sample Output 2

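Dataset Activity alone (logarithmic formula):

Dataset	% of Audit Log Messages	Score
DS1	65	89
DS2	25	78
DS3	10	70
DS4	5	67

All three metrics combined:

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	25	30	10000s	27
DS2	25	30	9000s	32
DS3	25	30	800s	37
DS4	25	30	70s	42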

Calculation Example:

DS1:

25 * 0.4 + 30 * 0.4 + 1/4 * 20 = 27

Sample Output 3

...

Dataset Activity alone (logarithmic formula):

Dataset	% of Audit Log Messages	Score
DS1	40	85
DS2	32	83
DS3	15	77
DS4	7	73
DS5	3	71
DS6	3	71

All three metrics combined:

Dataset	% of Audit Log Messages	% of unique programs	Time Since Last Read	Score
DS1	65	80	10s	78
DS2	25	40	20s	41
DS3	10	40	40s	25
DS4	5	10	30s	16

Problems with the design

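Scores go down (on average) as the number of tracked datasets increases (example: Sample Output 2). With N datasets, the average share of audit log messages is 100/N percent, so the first metric contributes less as N grows.
Most scores are on the lower end. Even datasets that look popular on paper have a score of around 65-80.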
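To illustrate the first problem, the sketch below computes the mean weighted score for N datasets that share audit log messages equally. The helper name and the equal-share setup are assumptions chosen for illustration; the weights are those of the three metrics.

```python
def mean_truth_score(n_datasets: int, programs_pct: float) -> float:
    """Mean weighted score for n equally active datasets (hypothetical helper)."""
    audit_pct = 100 / n_datasets  # equal share of audit log messages
    total = 0.0
    # Every recency rank from 1 to N is held by exactly one dataset.
    for rank in range(1, n_datasets + 1):
        total += (audit_pct * 40 / 100
                  + programs_pct * 40 / 100
                  + rank / n_datasets * 20)
    return total / n_datasets


for n in (2, 4, 10):
    print(n, round(mean_truth_score(n, programs_pct=30), 2))
```

With 30% of programs reading each dataset, the mean for N = 4 is 34.5, which is the average of the four scores in Sample Output 2 (27, 32, 37, 42). The mean keeps falling as N increases, because the audit-log term shrinks as 40/N.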
