This document provides a step-by-step guide to viewing Cloud Data Fusion pipeline logs in Stackdriver.

Overview

Cloud Data Fusion integrates with Stackdriver Logging, which allows you to collect and view your Cloud Data Fusion pipeline logs in Stackdriver. This is an optional feature that is useful if you use Stackdriver as your central log collector and viewer. Cloud Data Fusion's Stackdriver logs also include logs from Dataproc cluster resources such as YARN and HDFS. Viewing these logs can provide insight into the lifecycle and resource usage of your pipeline on the Dataproc cluster, which can help you debug and fine-tune your pipeline.

Instructions

Enabling Stackdriver Logging

To use Stackdriver Logging with your Cloud Data Fusion pipeline, you need to enable Stackdriver Logging when you create your Cloud Data Fusion instance.

  1. In the GCP Console Data Fusion Instances page, click Create instance.

  2. Click Show advanced options.

  3. Under Logging and monitoring, click Enable Stackdriver logging service.

Note

Cloud Data Fusion Beta supports enabling or disabling Stackdriver Logging only during instance creation. After you create a Cloud Data Fusion instance, you can't update the Stackdriver Logging option.
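For readers who create instances programmatically rather than through the console, the same choice could be expressed in the instance-creation request. The sketch below only builds the URL and JSON body and sends nothing. It assumes the v1beta1 Data Fusion REST endpoint and the enableStackdriverLogging field on the Instance resource; neither is stated in this document, so check them against the current API reference. The project, location, and instance names are hypothetical.

```python
import json


def build_create_instance_request(project, location, instance_id, enable_logging=True):
    """Build (url, body) for an instance-creation call without sending it.

    Assumption: the v1beta1 endpoint and the enableStackdriverLogging field
    are taken from the public Data Fusion REST API, not from this guide.
    """
    url = (
        "https://datafusion.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/instances"
        f"?instanceId={instance_id}"
    )
    body = {
        "type": "BASIC",
        # Logging can only be chosen here, at creation time.
        "enableStackdriverLogging": enable_logging,
    }
    return url, body


# Hypothetical identifiers, for illustration only.
url, body = build_create_instance_request("my-project", "us-central1", "my-instance")
print(url)
print(json.dumps(body, indent=2))
```

Because the option cannot be changed later, it may be worth making the flag an explicit argument in any provisioning script instead of relying on a default.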

Viewing logs in Stackdriver

Every Cloud Data Fusion pipeline run is assigned a unique RunID. After you deploy and run your pipeline, find the RunID of the pipeline run in the Summary section of the run. In Stackdriver, view the logs for the pipeline run with that RunID.

Obtain your pipeline RunID

  1. After your pipeline has successfully run, click Summary.

  2. In the Summary page, click the Table link.

  3. Click the RunID link to view the RunID and copy it to the clipboard.

Viewing, filtering, and downloading your logs

In the Stackdriver UI, in the Logs View page, select Cloud Dataproc Cluster → cdap-<pipeline-name>-<runId>.
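The resource selection above can also be written as a text filter. The following is a minimal sketch that derives the cluster name from the naming pattern given in this guide and builds an equivalent filter string. It assumes the Stackdriver resource type cloud_dataproc_cluster and its cluster_name label; the pipeline name and RunID are hypothetical.

```python
def dataproc_cluster_name(pipeline_name, run_id):
    """Cluster name pattern described in this guide: cdap-<pipeline-name>-<runId>."""
    return f"cdap-{pipeline_name}-{run_id}"


def cluster_filter(pipeline_name, run_id):
    """Build a filter selecting logs of the cluster that ran this pipeline run.

    Assumption: the resource type and label names follow Stackdriver's
    monitored-resource list; verify them in the Logs Viewer.
    """
    name = dataproc_cluster_name(pipeline_name, run_id)
    return (
        'resource.type="cloud_dataproc_cluster" AND '
        f'resource.labels.cluster_name="{name}"'
    )


# Hypothetical pipeline name and RunID, for illustration only.
print(cluster_filter("my-pipeline", "1a2b3c4d-0000-1111-2222-333344445555"))
```

If the cluster name shown in the Logs View page differs from this pattern, copy the name from the UI instead of deriving it.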

You can use the filter options at the top to narrow the logs you see. You can filter by component, such as datafusion-pipeline-logs, yarn-resourcemanager, or another component's logs. You can also filter the logs by log level.
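The component and log-level filters can likewise be combined into one filter string. This sketch assumes that each component appears as a log named projects/<project>/logs/<component> and that levels map to Stackdriver severities; both are assumptions to confirm in the UI. The project and cluster names are hypothetical.

```python
# Severity names defined by Stackdriver Logging, lowest to highest.
SEVERITIES = (
    "DEFAULT", "DEBUG", "INFO", "NOTICE", "WARNING",
    "ERROR", "CRITICAL", "ALERT", "EMERGENCY",
)


def pipeline_log_filter(project, cluster_name, component=None, min_severity=None):
    """Combine cluster, component, and minimum-severity clauses with AND."""
    clauses = [
        'resource.type="cloud_dataproc_cluster"',
        f'resource.labels.cluster_name="{cluster_name}"',
    ]
    if component:
        # Assumption: the component name is the log ID within the project.
        clauses.append(f'logName="projects/{project}/logs/{component}"')
    if min_severity:
        if min_severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {min_severity}")
        clauses.append(f"severity>={min_severity}")
    return " AND ".join(clauses)


# Hypothetical project and cluster names, for illustration only.
print(pipeline_log_filter(
    "my-project",
    "cdap-my-pipeline-abc",
    component="datafusion-pipeline-logs",
    min_severity="WARNING",
))
```

Leaving component or min_severity unset drops that clause, which mirrors leaving the corresponding drop-down at its default in the UI.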

You can download logs from Stackdriver. Click the Download logs option at the top.
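As an alternative to the Download logs option, logs could be fetched from a terminal. The sketch below only assembles a gcloud logging read command and prints it; it does not run it. It assumes the Cloud SDK is installed and that the filter string matches your cluster. The project name, filter, and limit are hypothetical.

```python
import shlex


def gcloud_read_command(project, log_filter, limit=1000):
    """Assemble (but do not run) a gcloud command that prints matching logs as JSON."""
    return [
        "gcloud", "logging", "read", log_filter,
        "--project", project,
        "--format", "json",
        "--limit", str(limit),
    ]


# Hypothetical project and filter, for illustration only.
cmd = gcloud_read_command(
    "my-project",
    'resource.type="cloud_dataproc_cluster" AND severity>=ERROR',
)
# shlex.join quotes the filter so the line can be pasted into a shell.
print(shlex.join(cmd) + " > pipeline-logs.json")
```

Building the argument list first and quoting it with shlex.join avoids shell-quoting mistakes with the double quotes inside the filter.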