This document provides a step-by-step guide to viewing Cloud Data Fusion pipeline logs in Stackdriver.
Overview
Cloud Data Fusion integrates with Stackdriver Logging, which allows you to collect and view your Cloud Data Fusion pipeline logs in Stackdriver. This is an optional feature that you can use if Stackdriver is your central tool for collecting and viewing logs. Cloud Data Fusion’s Stackdriver logs also include logs from Dataproc cluster resources such as YARN and HDFS. Viewing these logs can provide insight into the lifecycle and resource usage of your pipeline on the Dataproc cluster, which can help you debug and fine-tune your pipeline.
Enabling Stackdriver Logging
To use Stackdriver Logging with your Cloud Data Fusion pipeline, enable Stackdriver Logging when you create your Cloud Data Fusion instance.
In the GCP Console Data Fusion Instances page, click Create instance.
Click Show advanced options.
Under Logging and monitoring, click Enable Stackdriver logging service.
...
Note |
---|
Cloud Data Fusion Beta supports enabling or disabling Stackdriver Logging only during instance creation. After you create a Cloud Data Fusion instance, you can’t update the Stackdriver Logging option. |
Viewing logs in Stackdriver
Every Cloud Data Fusion pipeline run is assigned a unique RunID. After you deploy and run your pipeline, find the RunID of the pipeline run in the Summary section of the run. Then, in Stackdriver, view the logs for the pipeline run with that RunID.
Obtain your pipeline RunID
After your pipeline has successfully run, click Summary.
In the Summary page, click the Table link.
...
Click the RunId link to view the RunID and copy it to the clipboard.
Viewing, filtering, and downloading your logs
In the Stackdriver UI, on the Logs View page, select Cloud Dataproc Cluster → cdap-<pipeline-name>-<runId>.
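The cluster name pattern above can also be written as a Stackdriver advanced filter. The following is a minimal sketch, assuming that Dataproc cluster logs use the resource type `cloud_dataproc_cluster` with a `cluster_name` label. The helper names, the sample pipeline name, and the sample RunID are hypothetical, and the real cluster name may be normalized or truncated by Cloud Data Fusion.

```python
def cluster_name(pipeline_name: str, run_id: str) -> str:
    """Build a cluster name following the cdap-<pipeline-name>-<runId> pattern."""
    return f"cdap-{pipeline_name}-{run_id}"


def cluster_filter(pipeline_name: str, run_id: str) -> str:
    """Return a filter string that selects logs from one pipeline run's cluster.

    Assumption: the resource type and label name below match how Dataproc
    cluster logs are labeled in your project.
    """
    name = cluster_name(pipeline_name, run_id)
    return (
        'resource.type="cloud_dataproc_cluster" AND '
        f'resource.labels.cluster_name="{name}"'
    )


# Hypothetical pipeline name and RunID, for illustration only.
print(cluster_filter("mypipeline", "1234-abcd"))
```

The resulting string can be pasted into the filter box of the logs viewer if the assumed resource type and label match your environment.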
...
You can use the filter options at the top to narrow the logs you see. You can filter by component, such as datafusion-pipeline-logs or yarn-resourcemanager, and by log level.
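Filtering by component and log level can be sketched as a filter string. This example assumes that each component name (for example `yarn-resourcemanager`) is the log ID in a `logName` of the form `projects/<project>/logs/<component>`, and that Dataproc cluster logs use the resource type `cloud_dataproc_cluster`. The function name and the project ID are hypothetical.

```python
from typing import Optional

# Severity names used by Stackdriver Logging, from lowest to highest.
SEVERITIES = (
    "DEFAULT", "DEBUG", "INFO", "NOTICE", "WARNING",
    "ERROR", "CRITICAL", "ALERT", "EMERGENCY",
)


def log_filter(
    project_id: str,
    component: Optional[str] = None,
    min_severity: Optional[str] = None,
) -> str:
    """Combine optional component and minimum-severity clauses into one filter."""
    clauses = ['resource.type="cloud_dataproc_cluster"']
    if component:
        # Assumption: the component name is the log ID within the project.
        clauses.append(f'logName="projects/{project_id}/logs/{component}"')
    if min_severity:
        level = min_severity.upper()
        if level not in SEVERITIES:
            raise ValueError(f"unknown severity: {min_severity}")
        clauses.append(f"severity>={level}")
    return " AND ".join(clauses)


# Hypothetical project ID, for illustration only.
print(log_filter("my-project", "yarn-resourcemanager", "warning"))
```

Leaving out `component` or `min_severity` drops the corresponding clause, which mirrors leaving that filter option unset in the UI.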
...
...
You can download logs from Stackdriver. Click the Download logs option at the top.
...