...
This guide walks you through building a simple CDAP application that ingests web logs, aggregates request counts for different combinations of fields, and lets you query the request volume over a time period. From the results, you can derive insights into a web site's traffic and health. You will:
Use a Stream to ingest real-time log data;
Build a Workflow to turn log entries into multidimensional facts as they are received;
Use a Dataset to store the aggregated numbers; and
Build a Service to query the aggregated data across multiple dimensions.
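To make the idea of aggregating request counts over combinations of fields concrete, here is a minimal sketch in plain Python. It does not use the CDAP API: the field names (`ts`, `ip`, `response_status`), the aggregation groups, the one-hour resolution, and the sample events are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical aggregation groups: each tuple names fields that are counted together.
# The empty tuple counts all requests regardless of field values.
AGGREGATION_GROUPS = [(), ("response_status",), ("ip", "response_status")]

# Made-up log events; "ts" is a Unix timestamp in seconds.
EVENTS = [
    {"ts": 1423375230, "ip": "10.0.0.1", "response_status": "200"},
    {"ts": 1423375900, "ip": "10.0.0.2", "response_status": "404"},
    {"ts": 1423378000, "ip": "10.0.0.1", "response_status": "200"},
    {"ts": 1423390000, "ip": "10.0.0.1", "response_status": "200"},
]


def aggregate(events, groups=AGGREGATION_GROUPS, resolution_secs=3600):
    """Count events per time bucket and per combination of field values."""
    counts = Counter()
    for event in events:
        # Round the timestamp down to the start of its time bucket.
        bucket = event["ts"] - event["ts"] % resolution_secs
        for group in groups:
            dims = tuple(sorted((field, event[field]) for field in group))
            counts[(bucket, dims)] += 1
    return counts


def query(counts, start_ts, end_ts, **dimension_values):
    """Return {bucket: count} for one exact combination of field values."""
    wanted = tuple(sorted(dimension_values.items()))
    return {
        bucket: n
        for (bucket, dims), n in sorted(counts.items())
        if dims == wanted and start_ts <= bucket <= end_ts
    }


counts = aggregate(EVENTS)
print(query(counts, 1423370000, 1423400000))
print(query(counts, 1423370000, 1423400000, response_status="200"))
```

Because every event is counted once per aggregation group at ingest time, a query becomes a lookup over precomputed buckets rather than a scan of the raw logs. That is the trade-off this kind of design makes: more writes up front in exchange for cheap reads.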
...
First, we need a place to receive and process the events. CDAP provides a real-time stream processing system that is a great match for handling event streams. After first setting the application name, our WebAnalyticsApp adds a new Stream.
Then, the application configures a Cube dataset to compute and store aggregations for combinations of dimensions. Let’s take a closer look at the properties that are used to configure the Cube dataset:
...
    [
      {
        "measureName": "count",
        "dimensionValues": {},
        "timeValues": [
          { "timestamp": 1423375200, "value": 3 },
          { "timestamp": 1423389600, "value": 1 }
        ]
      }
    ]
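As a rough illustration of how a client might consume a response shaped like the one above, the following Python sketch parses the JSON, totals the values of each series, and converts the timestamps to UTC. It assumes the timestamps are Unix seconds; the function names `summarize` and `to_utc` are hypothetical helpers, not part of CDAP.

```python
import json
from datetime import datetime, timezone

# The response body from the example above, embedded as a string.
RESPONSE_TEXT = """
[ { "measureName": "count", "dimensionValues": {},
    "timeValues": [ { "timestamp": 1423375200, "value": 3 },
                    { "timestamp": 1423389600, "value": 1 } ] } ]
"""


def to_utc(timestamp_secs):
    """Format a Unix timestamp (assumed to be in seconds) as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(timestamp_secs, tz=timezone.utc).isoformat()


def summarize(response_text):
    """Return one summary dict per time series in the response."""
    summaries = []
    for series in json.loads(response_text):
        points = series["timeValues"]
        summaries.append({
            "measure": series["measureName"],
            "dimensions": series["dimensionValues"],
            "total": sum(point["value"] for point in points),
            "points": [(to_utc(p["timestamp"]), p["value"]) for p in points],
        })
    return summaries


for summary in summarize(RESPONSE_TEXT):
    print(summary["measure"], summary["total"], summary["points"])
```

The empty `dimensionValues` object indicates that the series is not restricted to any particular field value, so the total here covers all requests in the queried time range.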
Share and Discuss!
Have a question? Discuss at the CDAP User Mailing List.
License
Copyright © 2015 Cask Data, Inc.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
...