The purpose of this page is to document the plan for redesigning GitHub statistics collection in the Caskalytics app.

...

Current Implementation of GitHub Metrics

  • API: https://developer.github.com/v3/

  • Use a Workflow Custom Action to make periodic REST calls to the GitHub API.

  • Results will be written into the GitHub partition of the Fileset.

  • A MapReduce job will periodically read from the GitHub partition of the Fileset, and update the Cube dataset.
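The pull step described above could look roughly like the following Python sketch. It is illustrative only: the function names, the record layout, and the choice of the `GET /orgs/{org}/repos` endpoint are assumptions rather than the Caskalytics code, and writing into the GitHub partition of the Fileset is left out.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"  # base URL of the GitHub REST API v3


def fetch_org_repos(org):
    """Fetch one page of repositories for an organization (unauthenticated)."""
    request = urllib.request.Request(
        f"{API_ROOT}/orgs/{org}/repos",
        headers={"Accept": "application/vnd.github.v3+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def to_records(repos, collected_at):
    """Flatten repository objects into one record per repository.

    The record layout is hypothetical; it only keeps the counters
    named in the use cases.
    """
    fields = ("stargazers_count", "watchers_count", "forks_count", "open_issues_count")
    return [
        {"repo": repo["full_name"], "ts": collected_at,
         **{field: repo.get(field, 0) for field in fields}}
        for repo in repos
    ]
```

A periodic action would call `fetch_org_repos` on a schedule and hand the result of `to_records` to whatever writes the partition. Pagination, authentication, and rate limiting are not handled here.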

Use Cases

  • As a user of Caskalytics, I would like to store and retrieve all activity associated with my GitHub organization.
  • As a user of Caskalytics, I would like to view metrics for my GitHub repositories, including forks, pull requests, watchers, stargazers, and open issues.
  • As a user of Caskalytics, I would like to view metrics about the members of my organization, such as the number of issues opened and the number of pull requests created.
  • As a user of Caskalytics, I would like to view a histogram of metrics about my repositories.

New Implementation of GitHub Metrics

...

  • Metrics will be stored in a separate dataset from the raw messages.
  • Repo messages will be overwritten each time a new message is received from GitHub.
  • All Time Metrics
    • Per Repository
      • repo size
      • stargazers_count
      • watchers_count
      • forks_count
      • total pull requests
    • Per Message
      • count
    • Per Repo / Per Message
      • count
    • Per Sender
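As a rough illustration of the metrics listed above, the sketch below keeps the all-time counters in memory. The `MetricsStore` class and its attribute names are hypothetical stand-ins for the metrics dataset. Two points are assumptions because the plan does not spell them out: "total pull requests" is counted from `pull_request` messages whose `action` is `opened`, and the per-sender metric is taken to be a message count keyed by the sender's login.

```python
from collections import Counter

# Repository fields copied from the "repository" object of a webhook payload.
REPO_FIELDS = ("size", "stargazers_count", "watchers_count", "forks_count")


class MetricsStore:
    """In-memory stand-in for the metrics dataset (hypothetical)."""

    def __init__(self):
        self.repos = {}                    # latest snapshot per repository
        self.pull_requests = Counter()     # pull requests opened per repository
        self.per_message = Counter()       # count per event name
        self.per_repo_message = Counter()  # count per (repository, event name)
        self.per_sender = Counter()        # count per sender login (assumed metric)

    def record(self, event, payload):
        """Update the all-time metrics from one webhook message."""
        self.per_message[event] += 1
        repo = payload.get("repository")
        if repo:
            name = repo["full_name"]
            # Overwrite the stored snapshot with the newest values.
            self.repos[name] = {field: repo.get(field, 0) for field in REPO_FIELDS}
            self.per_repo_message[(name, event)] += 1
            if event == "pull_request" and payload.get("action") == "opened":
                self.pull_requests[name] += 1
        sender = payload.get("sender")
        if sender:
            self.per_sender[sender["login"]] += 1
```

Keeping the repository snapshot separate from the counters mirrors the split between overwritten repo messages and accumulated metrics.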

Capture Endpoint

  • The capture endpoint will be a catch-all endpoint that accepts POST messages from GitHub, verifies their authenticity, and writes each message to the data store.
  • A message is considered valid only if it carries the following headers and payload:
    • User-Agent should start with GitHub-Hookshot/<id>
    • X-GitHub-Delivery should be a UUID for the message
    • X-GitHub-Event should be the name of the message
    • X-Hub-Signature should contain the SHA-1 HMAC digest of the message for verification
    • payload should be the JSON message
  • If any required headers are missing or invalid, the response will be UNAUTHORIZED with a message stating that the caller is not authorized to call the service.
  • If the Event is missing, a BAD_REQUEST is returned.
  • If there is no payload, a BAD_REQUEST is returned.
  • If the payload digest does not match the one provided in the Signature header, or there is an error generating it, a BAD_REQUEST is returned.
  • When everything is successful, an OK is returned with a message stating that the message was processed successfully.
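The validation rules above can be sketched as a single function that maps a request to a status code. This is a minimal illustration under assumptions, not the endpoint's actual code: `check_message` and `_is_uuid` are hypothetical names, headers are assumed to arrive in a plain dictionary with the exact casing shown, the payload and shared secret are raw bytes, and the signature is assumed to use GitHub's `sha1=<hex digest>` form.

```python
import hashlib
import hmac
import uuid
from http import HTTPStatus


def _is_uuid(value):
    """Return True if value parses as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def check_message(headers, payload, secret):
    """Return the HTTP status the capture endpoint would answer with."""
    agent = headers.get("User-Agent", "")
    delivery = headers.get("X-GitHub-Delivery", "")
    signature = headers.get("X-Hub-Signature", "")
    # Missing or invalid identifying headers: the caller is not authorized.
    if not agent.startswith("GitHub-Hookshot/") or not signature or not _is_uuid(delivery):
        return HTTPStatus.UNAUTHORIZED
    # A missing event name or an empty payload is a bad request.
    if not headers.get("X-GitHub-Event"):
        return HTTPStatus.BAD_REQUEST
    if not payload:
        return HTTPStatus.BAD_REQUEST
    # Recompute the digest with the shared secret and compare in constant time.
    expected = "sha1=" + hmac.new(secret, payload, hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return HTTPStatus.BAD_REQUEST
    return HTTPStatus.OK
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information about how much of the signature matched. Writing the message to the data store would follow only after this function returns `HTTPStatus.OK`.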

...