...

This guide takes you through building a simple CDAP application that ingests web logs, aggregates request counts for different combinations of fields, and can then be queried for request volume over a time period. With it, you can retrieve insights on a web site's traffic and health. You will:

  • Use a Stream to ingest real-time log data;

  • Build a Workflow to process log entries as they are received into multidimensional facts;

  • Use a Dataset to store the aggregated numbers; and

  • Build a Service to query the aggregated data across multiple dimensions.

...

Let’s Build It!

The following sections will guide you through building an application from scratch. If you are interested in deploying and running the application right away, you can clone its source code from this GitHub repository. In that case, feel free to skip the next two sections and jump right to the Build and Run Application section.

...

First, we need a place to receive and process the events. CDAP provides a real-time stream processing system that is a great match for handling event streams. After first setting the application name, our WebAnalyticsApp adds a new Stream.

Then, the application configures a Cube dataset to compute and store aggregations for combinations of dimensions. Let’s take a closer look at the properties that are used to configure the Cube dataset:

...

Code Block
[
    {
        "measureName": "count",
        "dimensionValues": {},
        "timeValues": [
            {
                "timestamp": 1423375200,
                "value": 3
            },
            {
                "timestamp": 1423389600,
                "value": 1
            }
        ]
    }
]
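
To make the shape of this result easier to read, here is a minimal in-memory sketch of time-bucketed counting. It is not the Cube dataset's implementation and does not use any CDAP API. The class name HourlyCountSketch, its methods, and the sample event times are hypothetical, and the one-hour resolution is an assumption inferred from the timestamps above, which are both multiples of 3600 seconds.

```java
import java.util.Map;
import java.util.TreeMap;

public class HourlyCountSketch {
    // Assumed resolution: one hour, expressed in seconds.
    static final long RESOLUTION_SECONDS = 3600L;

    // Rounds a timestamp (seconds since the epoch) down to the start of its bucket.
    static long bucketStart(long timestampSeconds) {
        return timestampSeconds - (timestampSeconds % RESOLUTION_SECONDS);
    }

    // Adds 1 to the bucket of each event; TreeMap keeps buckets in time order.
    static Map<Long, Long> aggregate(long[] eventTimestamps) {
        Map<Long, Long> counts = new TreeMap<>();
        for (long ts : eventTimestamps) {
            counts.merge(bucketStart(ts), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical event times chosen to fall into the two buckets shown above.
        long[] events = {1423375215L, 1423376000L, 1423378799L, 1423390000L};
        aggregate(events).forEach((bucket, count) ->
            System.out.println("timestamp=" + bucket + " value=" + count));
    }
}
```

Each entry in "timeValues" can be read the same way: a bucket start time and the number of requests counted in that bucket. The empty "dimensionValues" object means the counts are not broken down by any dimension.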

Share and Discuss!

Have a question? Discuss at the CDAP User Mailing List.

License

Copyright © 2015 Cask Data, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

...