This page documents the plan for redesigning GitHub statistics collection in the Caskalytics app.
...
Current Implementation of GitHub Metrics
A Workflow Custom Action makes periodic RESTful calls to the GitHub API.
Results are written into the GitHub partition of the Fileset.
A MapReduce job periodically reads from the GitHub partition of the Fileset and updates the Cube dataset.
Use Cases
- As a user of Caskalytics, I would like to store and retrieve all activity associated with my GitHub organization.
- As a user of Caskalytics, I would like to view metrics for my GitHub repositories, including forks, pull requests, watchers, stargazers, and open issues.
- As a user of Caskalytics, I would like to view metrics about the members of my organization, such as the number of issues opened and the number of pull requests created.
- As a user of Caskalytics, I would like to view a histogram of metrics about my repositories.
New Implementation of GitHub Metrics
...
- Metrics will be stored in a separate dataset from the raw messages.
- Repo messages will be overwritten each time a new message is received from GitHub.
- All Time Metrics
- Per Repository
- repo size
- stargazers_count
- watchers_count
- forks_count
- total pull requests
- Per Message
- count
- Per Repo / Per Message
- count
- Per Sender
- Per Repository
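The all-time metrics listed above could be kept roughly as in the following sketch. It is an in-memory illustration only, not the Caskalytics implementation: the `GithubMetrics` class and its attribute names are hypothetical, the metrics dataset is replaced by plain Python containers, and the repository fields are assumed to arrive in the `repository` object of each message. Total pull requests and the per-sender breakdown are left out because the list does not say how they are derived.

```python
from collections import Counter


class GithubMetrics:
    """Hypothetical in-memory stand-in for the metrics dataset."""

    # Repository-level values copied from the most recent message.
    REPO_FIELDS = ("size", "stargazers_count", "watchers_count", "forks_count")

    def __init__(self):
        self.per_message = Counter()       # event name -> count
        self.per_repo_message = Counter()  # (repo name, event name) -> count
        self.repo_snapshot = {}            # repo name -> latest repo values

    def record(self, event, payload):
        """Update the counters for one received message."""
        self.per_message[event] += 1
        repo = payload.get("repository")
        if repo is None:
            # Some messages carry no repository; only the per-message count applies.
            return
        name = repo["full_name"]
        self.per_repo_message[(name, event)] += 1
        # Repo-level values are overwritten, not accumulated, on every message.
        self.repo_snapshot[name] = {f: repo.get(f) for f in self.REPO_FIELDS}
```

Counts only ever grow, while the repository snapshot always reflects the latest message, which matches the overwrite behaviour described for repo messages.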
Capture Endpoint
- The capture endpoint will be a catch-all endpoint that accepts POST messages from GitHub, verifies their authenticity, and writes each message to the data store.
- Each message should have the following headers to be considered "valid":
- User-Agent should start with GitHub-Hookshot/<id>
- X-GitHub-Delivery should be a UUID for the message
- X-GitHub-Event should be the name of the message
- X-Hub-Signature should contain a SHA-1 digest of the message for verification
- payload should be the JSON message
- If any required headers are missing or invalid, the response will be UNAUTHORIZED with a message stating that the caller is not authorized to call the service.
- If the Event is missing, a BAD_REQUEST is returned.
- If there is no payload, a BAD_REQUEST is returned.
- If the payload digest does not match the one provided in the Signature header, or there is an error generating it, a BAD_REQUEST is returned.
- When everything is successful, an OK is returned with a message stating that the message was successfully processed.
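The validation rules above can be sketched as a single function that maps a request to a status code. This is a minimal illustration, not the endpoint's actual code: `check_request`, `sign`, and the `secret` parameter are hypothetical names, and the sketch assumes the signature is the HMAC-SHA1 hex digest of the raw body, keyed with the webhook secret and prefixed with `sha1=`, which is how GitHub populates `X-Hub-Signature`. It also assumes that a missing `X-GitHub-Event` is checked separately from the other headers, since the rules give it its own BAD_REQUEST response.

```python
import hashlib
import hmac
import uuid
from http import HTTPStatus


def sign(secret: bytes, body: bytes) -> str:
    """Compute the expected signature header value for a request body."""
    return "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()


def _is_uuid(value: str) -> bool:
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def check_request(headers: dict, body: bytes, secret: bytes) -> HTTPStatus:
    """Return the status the capture endpoint would answer with (sketch)."""
    agent = headers.get("User-Agent", "")
    delivery = headers.get("X-GitHub-Delivery", "")
    signature = headers.get("X-Hub-Signature", "")

    # Missing or malformed identifying headers: the caller is not authorized.
    if (not agent.startswith("GitHub-Hookshot/")
            or not _is_uuid(delivery)
            or not signature):
        return HTTPStatus.UNAUTHORIZED

    # Assumption: the event name is checked on its own and yields BAD_REQUEST.
    if not headers.get("X-GitHub-Event"):
        return HTTPStatus.BAD_REQUEST

    if not body:
        return HTTPStatus.BAD_REQUEST

    # Constant-time comparison of the expected and supplied signatures.
    expected = sign(secret, body)
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return HTTPStatus.BAD_REQUEST

    return HTTPStatus.OK
```

Comparing with `hmac.compare_digest` rather than `==` avoids leaking, through timing, how much of a forged signature was correct.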
...