Truth Score depends on three metrics:
1) Percentage of Audit Log Messages (40% of the score).

This is (Audit Log Messages for a Dataset / Total Audit Log Messages). Program reads are omitted to avoid redundancy.

 
2) Percentage of Unique Programs Read (40% of the score).

This is (Total Unique Programs Reading from the Dataset / Total Programs Present).

 
3) Time Since Last Read (20% of the score, based on the rank of the dataset's most recent read among the most recent reads of all datasets).

If there are 10 datasets, they are sorted by the time since each dataset was last read. The most recently read dataset is ranked first and gets 10/10 * 20 = 20 points. The second most recently read dataset gets 9/10 * 20 = 18 points, the third gets 8/10 * 20 = 16 points, and so on. Because the time since the last read can range from 0 to a very large number, and some datasets may never have been read, a relative score seems necessary.
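The three metrics above can be combined as in the following minimal sketch. The names `DatasetStats` and `truth_scores` are hypothetical and chosen only for illustration. The sketch assumes the two percentages are already computed on a 0-100 scale and that the time since the last read is given in seconds. Tie handling and never-read datasets are not specified in this design, so they are not handled here.

```python
from dataclasses import dataclass


@dataclass
class DatasetStats:
    # Hypothetical container for the three inputs described above.
    name: str
    audit_pct: float        # % of audit log messages (0-100)
    program_pct: float      # % of unique programs reading the dataset (0-100)
    secs_since_read: float  # seconds since the dataset was last read


def truth_scores(stats):
    """Return {dataset name: score} using the 40/40/20 weighting."""
    n = len(stats)
    # Most recently read dataset first (smallest time since last read).
    by_recency = sorted(stats, key=lambda s: s.secs_since_read)
    scores = {}
    for rank, s in enumerate(by_recency):
        # Rank 0 gets n/n * 20 points, rank 1 gets (n-1)/n * 20, and so on.
        recency_points = (n - rank) / n * 20
        scores[s.name] = 0.4 * s.audit_pct + 0.4 * s.program_pct + recency_points
    return scores


# Inputs taken from Sample Output 1.
sample = [
    DatasetStats("DS1", 70, 60, 80),
    DatasetStats("DS2", 30, 50, 1000),
]
for name, score in truth_scores(sample).items():
    print(name, round(score))
```

With the inputs from Sample Output 1, this gives 72 for DS1 and 42 for DS2, matching the table.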


Sample Output 1

Dataset | % of Audit Log Messages | % of Unique Programs | Time Since Last Read | Score
DS1     | 70                      | 60                   | 80s                  | 72
DS2     | 30                      | 50                   | 1000s                | 42

 

Sample Output 2

Dataset | % of Audit Log Messages | % of Unique Programs | Time Since Last Read | Score
DS1     | 25                      | 30                   | 10000s               | 27
DS2     | 25                      | 30                   | 9000s                | 32
DS3     | 25                      | 30                   | 800s                 | 37
DS4     | 25                      | 30                   | 70s                  | 42


Sample Output 3

Dataset | % of Audit Log Messages | % of Unique Programs | Time Since Last Read | Score
DS1     | 65                      | 80                   | 10s                  | 78
DS2     | 25                      | 40                   | 20s                  | 41
DS3     | 10                      | 40                   | 40s                  | 25
DS4     | 5                       | 10                   | 30s                  | 16


Problems with the design

  • Scores go down on average as the number of tracked datasets increases (example: Sample Output 2).
  • Most scores are on the lower end. Even datasets that look popular on paper score only around 65-80.
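The first problem can be illustrated with a small sketch. It relies on one assumption not stated in this design: every counted audit log message belongs to exactly one tracked dataset, so the audit-message percentages sum to 100 and the average share is 100/n. The function name `mean_fixed_components` is hypothetical. The unique-programs component is left out because its average does not depend on the number of datasets.

```python
def mean_fixed_components(n):
    """Mean points per dataset from the audit-share and recency components.

    Assumes audit shares sum to 100% across the n tracked datasets.
    """
    # Average audit share is 100/n, weighted at 40% of the score.
    mean_audit_points = 0.4 * (100 / n)
    # Recency points are n/n*20, (n-1)/n*20, ..., 1/n*20; take their mean.
    mean_recency_points = sum((n - rank) / n * 20 for rank in range(n)) / n
    return mean_audit_points + mean_recency_points


for n in (2, 4, 10, 100):
    print(n, round(mean_fixed_components(n), 1))
```

Under this assumption the average contribution of these two components falls from 35 points with two datasets to 22.5 with four and 15 with ten. For Sample Output 2 (four datasets), 22.5 plus the 12 points from the unique-programs component gives 34.5, which is the mean of the four scores in that table.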