This page documents the plan for redesigning GitHub statistics collection in the Caskalytics app.

...

Current Implementation of GitHub Metrics

  • API: https://developer.github.com/v3/

  • Use a Workflow Custom Action to run periodic RESTful calls to the GitHub API.

  • Results will be written into the GitHub partition of the Fileset.

  • A MapReduce job will periodically read from the GitHub partition of the Fileset, and update the Cube dataset.
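
As an illustration of the polling step above, here is a minimal Python sketch of one poll against the GitHub v3 "get a repository" call. It is not the Caskalytics code: the function names and the choice of fields to keep are assumptions made for this example, and writing the result to the Fileset partition is left out.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"
# Counters kept from each response (an assumption for this sketch).
REPO_FIELDS = ("size", "stargazers_count", "watchers_count", "forks_count")


def repo_url(owner, repo):
    """Build the URL for the GitHub v3 'get a repository' call."""
    return f"{API_ROOT}/repos/{owner}/{repo}"


def extract_repo_stats(repo_json):
    """Keep only the counters of interest from a repository response."""
    stats = {"full_name": repo_json["full_name"]}
    for field in REPO_FIELDS:
        stats[field] = repo_json.get(field, 0)
    return stats


def fetch_repo_stats(owner, repo):
    """Perform one poll. This makes a network call, and unauthenticated
    requests are subject to GitHub's rate limits."""
    request = urllib.request.Request(
        repo_url(owner, repo),
        headers={"Accept": "application/vnd.github.v3+json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_repo_stats(json.load(response))
```

A scheduler would call `fetch_repo_stats` once per repository per interval and append each returned record to the current partition.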

...

  • Expose a service that accepts and verifies webhook messages from GitHub and writes those messages to a Datatable.
    • The service will store the raw messages and maintain a metrics table with stats at the repo and user level
  • Expose a RESTful endpoint to query metrics from the aggregates table and return results in JSON
  • Use the data service to drive a visual display of the information.
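
The page does not say how webhook messages are verified. One common approach is to check the HMAC signature that GitHub attaches to each delivery when a webhook secret is configured: the `X-Hub-Signature-256` header carries `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The sketch below assumes Python and that mechanism; the function names are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret, body):
    """Compute the signature value expected for a raw request body.

    secret and body are bytes; the result has the form 'sha256=<hex digest>'.
    """
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_webhook(secret, body, signature_header):
    """Return True when the header matches the HMAC of the raw body."""
    if not signature_header:
        # No signature was sent, so the message cannot be verified.
        return False
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The check has to run on the raw bytes of the request body, before any JSON parsing, because re-serializing the payload can change the bytes and therefore the digest.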

Metrics Calculated

  • Metrics will be stored in a separate dataset from the raw messages
  • Repo metrics will be overwritten each time a new message is received from GitHub
  • Per-user metrics will be incremented
  • All Time Metrics
    • Per Repository
      • repo size
      • stargazers_count
      • watchers_count
      • forks_count
    • Per User
      • Issues <action> (opened, closed, reopened)
      • Issue Comment Created

Capture Endpoint

  • The capture endpoint will be a catch all endpoint where the