This document provides a step-by-step guide to viewing Cloud Data Fusion pipeline logs in Stackdriver.
Overview
Cloud Data Fusion integrates with Stackdriver Logging, which allows you to collect and view your Cloud Data Fusion pipeline logs. These logs include output from services that run on the Dataproc cluster, such as YARN and HDFS. Viewing them can provide insight into the lifecycle and resource usage of your pipeline on the Dataproc cluster, which can help you debug and fine-tune your pipeline.
Enabling Stackdriver Logging
To use Stackdriver Logging with your Cloud Data Fusion pipeline, you need to enable Stackdriver Logging when you create your Cloud Data Fusion instance.
If Stackdriver Logging was not enabled during instance creation, you can enable it on the existing instance with the gcloud command. For example:
gcloud beta data-fusion instances update $INSTANCE_NAME \
--project=$PROJECT \
--location=$LOCATION \
--enable_stackdriver_logging
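To confirm that the update took effect, you can read the setting back from the instance. The following is a minimal sketch, not an official procedure: the helper name stackdriver_logging_status is hypothetical, and it assumes that the instance resource exposes an enableStackdriverLogging field and that gcloud prints a true boolean as "True".

```shell
# Hypothetical helper: prints "enabled" or "disabled" for one instance.
# Assumption: the instance resource has an enableStackdriverLogging field
# and gcloud renders a true boolean as "True" (compared case-insensitively).
stackdriver_logging_status() {
  local instance="$1" project="$2" location="$3" value
  value="$(gcloud beta data-fusion instances describe "$instance" \
    --project="$project" \
    --location="$location" \
    --format='value(enableStackdriverLogging)')"
  # An empty or non-true value is treated as disabled.
  case "$(printf '%s' "$value" | tr '[:upper:]' '[:lower:]')" in
    true) echo "enabled" ;;
    *)    echo "disabled" ;;
  esac
}
```

With the same variables as the update command, the call would be: stackdriver_logging_status "$INSTANCE_NAME" "$PROJECT" "$LOCATION"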
To enable Stackdriver Logging while creating a new instance:
On the Data Fusion Instances page in the GCP Console, click Create instance.
Click Show advanced options.
Under Logging and monitoring, click Enable Stackdriver logging service.
Viewing logs in Stackdriver
Every Cloud Data Fusion pipeline run is assigned a unique RunID. After you deploy and run your pipeline, find the RunID of the pipeline run whose logs you want to view. Then, in Stackdriver, view the logs for the pipeline run with that RunID.
Getting your pipeline RunID
After your pipeline has successfully run, click Summary.
In the Summary page, click the Table link.
Click the RunId link to view the RunID and copy it to the clipboard.
Viewing logs
Go to the Stackdriver Logging > Logs Viewer page in the GCP Console.
Select Cloud Dataproc Cluster > cdap-<pipeline-name>-<runId>.
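The same selection can be sketched from the command line. This is an illustration under assumptions rather than a documented workflow: the function names are hypothetical, the filter assumes that the Cloud Dataproc Cluster resource appears in Logging as resource type cloud_dataproc_cluster with a cluster_name label, and the cluster name is assumed to follow the cdap-<pipeline-name>-<runId> pattern literally. If the name shown in the Logs Viewer differs (for example, if it is lowercased or shortened), copy it from there instead.

```shell
# Prints the Dataproc cluster name for a pipeline run, following the
# cdap-<pipeline-name>-<runId> pattern described above.
dataproc_cluster_name() {
  printf 'cdap-%s-%s\n' "$1" "$2"
}

# Prints a Logging filter that selects entries from that cluster.
# Assumption: resource type cloud_dataproc_cluster with a cluster_name label.
cluster_log_filter() {
  printf 'resource.type="cloud_dataproc_cluster" AND resource.labels.cluster_name="%s"\n' \
    "$(dataproc_cluster_name "$1" "$2")"
}

# Reads up to 50 matching entries (requires gcloud and credentials).
read_pipeline_logs() {
  local project="$1" pipeline="$2" run_id="$3"
  gcloud logging read "$(cluster_log_filter "$pipeline" "$run_id")" \
    --project="$project" --limit=50
}
```

For example, read_pipeline_logs "$PROJECT" my-pipeline "$RUN_ID" would query the cluster named cdap-my-pipeline-$RUN_ID, where my-pipeline and RUN_ID are placeholders for your own values.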
Filtering logs
You can use the filter options to narrow the logs you see. You can filter by component, such as datafusion-pipeline-logs or yarn-resourcemanager, or by log level. Use the drop-down menus to choose a filter.
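The drop-down filters correspond roughly to clauses in a Logging filter expression. The sketch below shows one way to express a component and a minimum log level as text. The helper name narrow_log_filter is hypothetical, and it assumes that the component names above appear as part of each entry's logName, which the substring match logName:"..." relies on.

```shell
# Appends a component clause and a minimum severity clause to an
# existing Logging filter and prints the combined filter.
# Assumption: the component name is a substring of the entry's logName.
narrow_log_filter() {
  local base="$1" component="$2" min_severity="$3"
  printf '%s AND logName:"%s" AND severity>=%s\n' \
    "$base" "$component" "$min_severity"
}
```

For example, narrow_log_filter 'resource.type="cloud_dataproc_cluster"' datafusion-pipeline-logs ERROR prints a filter that keeps only pipeline log entries at ERROR level or higher.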
Downloading logs
You can download logs from Stackdriver. Click the Download logs option at the top of the page.
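As an alternative to the Download logs option, matching entries can be saved to a local file with gcloud. This is a minimal sketch: the function name download_pipeline_logs and its arguments are hypothetical, and the filter is whatever Logging filter expression you would otherwise use in the Logs Viewer.

```shell
# Saves log entries that match a Logging filter to a local JSON file.
# Requires gcloud and credentials with permission to read logs.
download_pipeline_logs() {
  local project="$1" filter="$2" outfile="$3"
  gcloud logging read "$filter" \
    --project="$project" \
    --format=json > "$outfile"
}
```

For example: download_pipeline_logs "$PROJECT" 'resource.type="cloud_dataproc_cluster"' pipeline-logs.json. By default, gcloud logging read only returns recent entries, so for older pipeline runs you may need to widen the time range with the --freshness flag.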