...

Cloud Data Fusion integrates with Stackdriver Logging, which allows you to collect and view your Cloud Data Fusion pipeline logs. Cloud Data Fusion’s Stackdriver logs include logs from resources such as YARN and HDFS on the Dataproc cluster. Viewing these logs can provide insight into the lifecycle and resource usage of your pipeline on the Dataproc cluster, which can help you debug and fine-tune your pipeline.

...

To use Stackdriver Logging with your Cloud Data Fusion pipeline, you need to enable Stackdriver Logging when you create your Cloud Data Fusion instance.

...

If Stackdriver Logging

...

was not enabled during instance creation.

...

It can be enabled with the gcloud command, for example:

gcloud beta data-fusion instances update $INSTANCE_NAME \
--project=$PROJECT \
--location=$LOCATION \
--enable_stackdriver_logging

  1. In the GCP Console Data Fusion Instances page, click Create instance.

  2. Click Show advanced options.

  3. Under Logging and monitoring, click Enable Stackdriver logging service.

...

Viewing logs in Stackdriver

...

You can use the filter options to filter the logs you see. You can filter by component, such as datafusion-pipeline-logs or yarn-resourcemanager, or by log level. Use the drop-down menu to choose a filter.
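
The same kind of filtering can probably be done from the command line with gcloud logging read. The following is a minimal sketch, not a documented Cloud Data Fusion procedure. It assumes that the component name (for example datafusion-pipeline-logs) appears as a substring of each entry's logName, and that $PROJECT holds your project ID. The helper names build_log_filter and read_component_logs are hypothetical.

```shell
# Build a Cloud Logging filter from a component name and a minimum severity.
# Assumption: the component name is a substring of the entry's logName.
# If no severity is given, DEFAULT (the lowest level) is used.
build_log_filter() {
  printf 'logName:"%s" AND severity>=%s' "$1" "${2:-DEFAULT}"
}

# Read the 20 most recent entries that match the filter.
# Assumption: PROJECT is set to your project ID; requires gcloud and credentials.
read_component_logs() {
  gcloud logging read "$(build_log_filter "$1" "$2")" \
    --project="$PROJECT" \
    --limit=20
}

# Example call (not run automatically):
#   read_component_logs datafusion-pipeline-logs WARNING
```

If the filter returns nothing, inspect an entry in the Stackdriver Logging viewer to see how its logName is actually formed, and adjust the filter to match.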

...

Downloading logs

...