...

  • Caskalytics should pass and store the following dimensions, some of which will be processed further to extract additional information.
    • IP Address
    • Visitor Id
    • Full Page Url
    • Page Title
    • Full Referring Url
    • User Agent String
    • Screen Resolution
    • Viewport Resolution
  • These secondary metrics can be extracted from the data stored above
    • Hostname (from full page url)
    • Path (from full page url)
    • Referring Source (from full referring url)
    • Referring Path (from full referring url)
    • Standard Campaign Parameters (from full page url)
      • Campaign
      • Source
      • Medium
      • Content
      • Keyword
    • OS and Browser data extracted (from User Agent)
      • OS
      • OS Version
      • Browser
      • Browser Version
    • Location based (from IP Address)
      • ISP (the organization the IP address is registered to)
      • Continent
      • Country
      • Region (or State)
      • City
      • Postal / Zip
      • Lat, Lon
  • Metrics
    • Pageviews
    • Unique pageviews
    • Sessions
    • Users
    • New Users
    • Pages / Session
    • Bounce Rate
    • Avg Session Duration
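
The calculated metrics above depend on how a session is defined, which this page does not specify. The following is a minimal sketch, assuming a session is a run of pageviews from one visitor with no gap longer than 30 minutes, and a bounce is a session with a single pageview. The names `sessionize`, `summarize`, and `SESSION_TIMEOUT_MS` are hypothetical.

```python
from collections import defaultdict

# Assumption: a session ends after 30 minutes of inactivity.
SESSION_TIMEOUT_MS = 30 * 60 * 1000


def sessionize(hits):
    """Group (visitor_id, timestamp_ms) pageviews into per-visitor sessions.

    Returns a list of sessions, each a sorted list of timestamps.
    """
    by_visitor = defaultdict(list)
    for visitor_id, ts in hits:
        by_visitor[visitor_id].append(ts)

    sessions = []
    for timestamps in by_visitor.values():
        timestamps.sort()
        current = [timestamps[0]]
        for ts in timestamps[1:]:
            if ts - current[-1] > SESSION_TIMEOUT_MS:
                # Gap too long: close the session and start a new one.
                sessions.append(current)
                current = [ts]
            else:
                current.append(ts)
        sessions.append(current)
    return sessions


def summarize(hits):
    """Compute a few of the listed metrics from raw pageviews."""
    hits = list(hits)
    sessions = sessionize(hits)
    count = len(sessions)
    if count == 0:
        return {"pageviews": 0, "sessions": 0, "users": 0,
                "pages_per_session": 0.0, "bounce_rate": 0.0,
                "avg_session_duration_ms": 0.0}
    return {
        "pageviews": len(hits),
        "sessions": count,
        "users": len({visitor_id for visitor_id, _ in hits}),
        "pages_per_session": len(hits) / count,
        # Assumption: a bounce is a session with exactly one pageview.
        "bounce_rate": sum(1 for s in sessions if len(s) == 1) / count,
        # Duration is measured from first to last pageview in the session.
        "avg_session_duration_ms": sum(s[-1] - s[0] for s in sessions) / count,
    }
```

Under this definition a single-pageview session contributes a duration of zero, which pulls the average down; whether that is acceptable is a design choice this sketch does not settle.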

...

High Level Components

  • A minified JavaScript library and tracking snippet that can be added to any webpage and will collect and send data to a predefined Caskalytics endpoint.
  • A service that will handle the requests from the tracking code, store the required data, and return a 1x1 GIF.
  • A job that runs periodically that will process the new data written to the table and split the raw information into the secondary metrics. This job should also attempt to identify bot traffic.
  • A job that will run periodically to process additional calculated metrics such as pages per session and bounce rate
  • A service that exposes this data via a RESTful interface
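
As a rough illustration of the splitting job described above, the sketch below derives hostname, path, and campaign values from the stored URLs using only the Python standard library. It assumes that "standard campaign parameters" means the common `utm_*` query parameters, and the bot check is a deliberately naive substring heuristic. All function names here are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Assumption: campaign dimensions map to the common utm_* query parameters.
CAMPAIGN_PARAMS = {
    "campaign": "utm_campaign",
    "source": "utm_source",
    "medium": "utm_medium",
    "content": "utm_content",
    "keyword": "utm_term",
}

# Assumption: a naive marker list; real bot detection needs much more.
BOT_MARKERS = ("bot", "crawler", "spider")


def split_page_url(full_page_url):
    """Extract hostname, path, and campaign dimensions from a page URL."""
    parts = urlsplit(full_page_url)
    query = parse_qs(parts.query)
    dims = {"hostname": parts.hostname, "path": parts.path or "/"}
    for name, param in CAMPAIGN_PARAMS.items():
        values = query.get(param)
        dims[name] = values[0] if values else None
    return dims


def split_referrer(full_referring_url):
    """Extract referring source (hostname) and path; empty for direct visits."""
    if not full_referring_url:
        return {"referring_source": None, "referring_path": None}
    parts = urlsplit(full_referring_url)
    return {"referring_source": parts.hostname,
            "referring_path": parts.path or "/"}


def looks_like_bot(user_agent):
    """Flag empty user agents and those containing a known bot marker."""
    if not user_agent:
        return True
    lowered = user_agent.lower()
    return any(marker in lowered for marker in BOT_MARKERS)
```

OS, browser, and location dimensions are left out because they need a user agent parser and an IP geolocation database, neither of which this page names.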

JavaScript Tracking Snippet

  • Responsible for constructing the request to the tracking beacon and inserting the img tag on the webpage
  • Gathers the following information from the browser using JavaScript
    • Full referrer url
    • Full page url
    • Cookie id (if no cookie is present, one is created)
    • Page title (from the html title tag)
    • Screen and viewport resolution
    • A generated cache buster
  • Data is URL-encoded and appended to the query string of a GET request to the server, along with a version and the cache buster
  • Configurable params
    • Endpoint url
    • Property Id - A unique string used to identify the property
  • Should not depend upon an external library for any of these metrics
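
The snippet itself would be written in JavaScript; the Python sketch below only illustrates the shape of the GET request it might produce, which can also be handy for exercising the collection endpoint from a test client. The parameter names (`v`, `pid`, `cb`), the version value, and the function name are assumptions, not part of this specification.

```python
import random
from urllib.parse import urlencode

# Hypothetical version string sent with every beacon request.
SNIPPET_VERSION = "1"


def build_beacon_url(endpoint_url, property_id, fields, cache_buster=None):
    """Build the beacon GET URL with URL-encoded query parameters.

    `fields` holds the gathered values (page url, referrer, title, ...).
    A random cache buster is generated when none is supplied.
    """
    if cache_buster is None:
        cache_buster = random.randrange(10**9)
    params = {"v": SNIPPET_VERSION, "pid": property_id}
    params.update(fields)
    # The cache buster goes last so each request URL is unique.
    params["cb"] = cache_buster
    return f"{endpoint_url}?{urlencode(params)}"
```

For example, `build_beacon_url("https://collect.example.com/t.gif", "site-1", {"url": "https://example.com/"}, cache_buster=42)` yields `https://collect.example.com/t.gif?v=1&pid=site-1&url=https%3A%2F%2Fexample.com%2F&cb=42`, where the endpoint host is a placeholder.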

Collection Service Endpoint

  • Responsible for collecting information from beacon request and storing that information in a data store
  • Additional metrics gathered in this service
    • Requester IP
    • User Agent String
    • Time of request in UTC
  • Data is written to the raw data table using the key <full page url>-<user-id>-<timestampInMilliseconds>
  • Each piece of data is stored in its own column
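
The key format above can be sketched as follows, assuming the timestamp is the UTC request time expressed as milliseconds since the Unix epoch. The function names and the column names in the example are hypothetical, and the data store itself is left out.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch.

    Integer timedelta division avoids floating-point rounding.
    """
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_row(full_page_url, user_id, received_at, columns):
    """Return (row_key, row) for one beacon request.

    `columns` maps column names to values so each value gets its own column.
    """
    ts_ms = to_epoch_ms(received_at)
    key = f"{full_page_url}-{user_id}-{ts_ms}"
    row = dict(columns)
    row["timestamp_ms"] = ts_ms
    return key, row
```

Because page URLs can themselves contain hyphens, a key built this way cannot be split back into its parts reliably; the individual columns remain the source for each value.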

Dimension Splitter / Bot Filter Job

  

Metrics Calculator Job

RESTful Service