Creating Hydrator Applications using CDAP System Artifacts

Source Code Repository: Source code (and other resources) for this guide are available at the CDAP Guides GitHub repository.

Using the built-in cdap-data-pipeline and cdap-data-streams system artifacts, you can create Cask Hydrator Pipelines (Hydrator Applications) with just a JSON configuration file. CDAP ships with a set of built-in Sources, Sinks, Transforms, and other plugins (described here) which can be used to create batch and real-time data pipeline applications right out of the box.

Note: If you want to create your own Source, Sink, or other plugin, you can find more instructions on how to do that here.

Note: Both the cdap-etl-batch and cdap-etl-realtime system artifacts have been deprecated as of CDAP 3.5.0 and replaced with the artifacts cdap-data-pipeline and cdap-data-streams respectively.
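
To give a sense of what such a JSON configuration file looks like, here is a minimal sketch that writes a skeleton for a cdap-data-streams pipeline. The function name write_pipeline_skeleton is hypothetical, and every value in angle brackets is a placeholder rather than a real plugin name, type, or version. The field layout (an artifact section plus a config section with stages and connections) is our assumption about the general shape of the file; the guide for each application contains the authoritative configuration.

```shell
# Hypothetical helper: writes a skeleton pipeline configuration to the path in $1.
# Angle-bracket values are placeholders and must be replaced before use.
write_pipeline_skeleton() {
  cat > "$1" <<'EOF'
{
  "artifact": {
    "name": "cdap-data-streams",
    "version": "<artifact version>",
    "scope": "system"
  },
  "config": {
    "stages": [
      {
        "name": "source",
        "plugin": {
          "name": "<source plugin name>",
          "type": "<source plugin type>",
          "properties": {}
        }
      },
      {
        "name": "sink",
        "plugin": {
          "name": "<sink plugin name>",
          "type": "<sink plugin type>",
          "properties": {}
        }
      }
    ],
    "connections": [
      { "from": "source", "to": "sink" }
    ]
  }
}
EOF
}
```

Calling write_pipeline_skeleton config.json produces a file with one source stage connected to one sink stage, which is the shape shared by both applications in this guide.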

What You Will Create

  • Real-time JMS to Stream: This application reads messages from a JMS producer in real time and writes them to a CDAP Stream.

  • Real-time Kafka to TPFS Avro: This application fetches messages from Kafka in real time and writes them to TimePartitionedFileSets in Avro format.

What You Will Need

Let's Begin!

For these guides, we will use the CDAP CLI to create and manage Hydrator Applications. The CLI commands assume that the cdap-cli.sh script is available on your PATH. If this is not the case, please add it:

$ export PATH=$PATH:<CDAP home>/bin

or, from within the <CDAP home> directory:

$ export PATH=${PATH}:`pwd`/bin
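
If you are unsure whether the script is already reachable, the check can be wrapped in a small function. This is only a sketch: the function name ensure_cdap_cli_on_path and the CDAP_HOME variable are hypothetical stand-ins for <CDAP home>, not something CDAP defines.

```shell
# Hypothetical helper: makes sure cdap-cli.sh can be found on PATH.
# $1 (or the assumed CDAP_HOME variable) is the CDAP home directory.
ensure_cdap_cli_on_path() {
  cdap_home="${1:-${CDAP_HOME:-}}"

  # Nothing to do if the script is already reachable.
  if command -v cdap-cli.sh >/dev/null 2>&1; then
    return 0
  fi

  # Otherwise append <CDAP home>/bin, but only if the script really lives there.
  if [ -n "$cdap_home" ] && [ -x "$cdap_home/bin/cdap-cli.sh" ]; then
    PATH="$PATH:$cdap_home/bin"
    export PATH
    return 0
  fi

  echo "cdap-cli.sh not found; pass the CDAP home directory as an argument" >&2
  return 1
}
```

For example, ensure_cdap_cli_on_path "<CDAP home>" leaves PATH untouched when cdap-cli.sh is already available, and returns a non-zero status when the script cannot be located.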

If you haven't already started a standalone CDAP installation, start it with the command:

$ cdap.sh start

Now navigate to the guide for the Hydrator Application that you want to create. It contains the instructions for creating that specific application.

Share and Discuss!

Have a question? Discuss it on the CDAP User Mailing List.

License

Copyright © 2015-2016 Cask Data, Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.