This document provides a step-by-step guide to viewing Cloud Data Fusion pipeline logs in Stackdriver.
...
Cloud Data Fusion integrates with Stackdriver Logging, which allows you to collect and view your Cloud Data Fusion pipeline logs. Cloud Data Fusion’s Stackdriver logging contains logs from various resources, such as YARN and HDFS, from the Dataproc cluster. Viewing these logs can provide insight into the lifecycle and resource usage of your pipeline on the Dataproc cluster, which can help you debug and fine-tune your pipeline.
...
To use Stackdriver Logging with your Cloud Data Fusion pipeline, you need to enable Stackdriver Logging when you create your Cloud Data Fusion instance.
If Stackdriver Logging was not enabled during instance creation, you can enable it with the gcloud command. For example:
gcloud beta data-fusion instances update $INSTANCE_NAME \
--project=$PROJECT \
--location=$LOCATION \
--enable_stackdriver_logging
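To check whether the setting is active on an instance, one option is to read the instance back with `describe`. The following is a minimal sketch, not an official procedure. The helper name `stackdriver_logging_enabled` is hypothetical, and the `enableStackdriverLogging` field name is an assumption based on the instance resource; verify it against the full `describe` output for your gcloud version.

```shell
# Sketch: print the Stackdriver Logging setting of a Cloud Data Fusion instance.
# $1 = instance name, $2 = project ID, $3 = location
# Assumption: the instance resource exposes an enableStackdriverLogging field.
stackdriver_logging_enabled() {
  gcloud beta data-fusion instances describe "$1" \
    --project="$2" \
    --location="$3" \
    --format="value(enableStackdriverLogging)"
}

# Usage (requires an authenticated gcloud):
# stackdriver_logging_enabled "$INSTANCE_NAME" "$PROJECT" "$LOCATION"
```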
In the GCP Console Data Fusion Instances page, click Create instance.
Click Show advanced options.
Under Logging and monitoring, click Enable Stackdriver logging service.
...
Note: Cloud Data Fusion Beta supports enabling or disabling Stackdriver Logging only during instance creation. After you create a Cloud Data Fusion instance, you can’t update the Stackdriver Logging option.
...
Viewing logs in Stackdriver
...
After your pipeline has successfully run, click Summary.
In the Summary page, click the Table link.
Click the RunId link to view the run ID and copy it to the clipboard.
Viewing logs
...
Go to the Stackdriver
...
Logging > Logs Viewer page in the GCP Console.
Select Cloud Dataproc Cluster
...
> cdap-<pipeline-name>-<runId>.
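The same selection can be approximated from the command line. The sketch below builds a Logging filter for the Dataproc cluster of one pipeline run, using the `cdap-<pipeline-name>-<runId>` pattern described above. The helper name `run_log_filter` and the `PIPELINE_NAME` and `RUN_ID` variables are hypothetical, and the actual cluster name may differ from this pattern in your environment, so treat the result as a starting point.

```shell
# Sketch: build a Logging filter for one pipeline run's Dataproc cluster.
# $1 = pipeline name, $2 = run ID (copied from the RunId link)
# Assumption: the cluster is named cdap-<pipeline-name>-<runId>.
run_log_filter() {
  printf 'resource.type="cloud_dataproc_cluster" AND resource.labels.cluster_name="cdap-%s-%s"' \
    "$1" "$2"
}

# Usage (requires an authenticated gcloud; variable names are placeholders):
# gcloud logging read "$(run_log_filter "$PIPELINE_NAME" "$RUN_ID")" \
#   --project="$PROJECT" --limit=50
```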
...
Filtering logs
You can use the filter options to narrow the logs you see. You can filter by component, such as datafusion-pipeline-logs or yarn-resourcemanager logs, or by log level. Use the drop-down menu to choose a filter.
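An equivalent text filter might look like the sketch below, which narrows a base filter to one component log and a minimum log level. The helper name `component_filter` is hypothetical, and the mapping of component names to `projects/<project>/logs/<component>` log names is an assumption; check the log names shown in the Logs Viewer before relying on it.

```shell
# Sketch: narrow a base filter to one component log and a minimum severity.
# $1 = base filter, $2 = project ID, $3 = component log name, $4 = minimum severity
# Assumption: each component appears as a log ID under projects/<project>/logs/.
component_filter() {
  printf '%s AND logName="projects/%s/logs/%s" AND severity>=%s' "$1" "$2" "$3" "$4"
}

# Usage (requires an authenticated gcloud; values are placeholders):
# gcloud logging read \
#   "$(component_filter 'resource.type="cloud_dataproc_cluster"' "$PROJECT" yarn-resourcemanager WARNING)" \
#   --project="$PROJECT" --limit=50
```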
...
Downloading logs
...