
This article outlines the steps required to connect to sources and sinks that reside outside of the customer project. The workflow within Data Fusion (CDF) is the same; you only need to grant two service accounts access to the external resources so CDF can reach them.

Service Account Description

There is some confusion regarding the various service accounts Data Fusion creates and uses to operate. This table provides a breakdown of each service account and its function.

Service account name format

Description

Uses in CDF

service-<project_number>@gcp-sa-datafusion.iam.gserviceaccount.com

Service account used by CDF in the tenant project to access resources in the customer project.

This account is used to:

  • Access resources during Preview

  • Access resources from Wrangler

  • Create the Dataproc cluster in the customer project

<project_number>-compute@developer.gserviceaccount.com

Default service account used by the Dataproc VMs.

This account is used to:

  • Access resources during a pipeline run for a deployed pipeline
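As a quick illustration of the two name formats in the table, the following minimal Python sketch fills in `<project_number>` for both accounts. The function name `cdf_service_accounts`, the dictionary keys, and the sample project number are hypothetical and used only for illustration; the email formats themselves come from the table above.

```python
def cdf_service_accounts(project_number: str) -> dict:
    """Build both service account emails from a numeric project number.

    The formats are taken from the table above; the function name and the
    dictionary keys are illustrative, not part of any Google API.
    """
    # The project number is numeric. The project ID (a string such as
    # "my-project") will not produce a valid account name.
    if not project_number.isdigit():
        raise ValueError("expected the numeric project number, not the project ID")
    return {
        # Used for Preview, Wrangler, and creating the Dataproc cluster.
        "data_fusion": f"service-{project_number}@gcp-sa-datafusion.iam.gserviceaccount.com",
        # Default account of the Dataproc VMs, used during deployed pipeline runs.
        "dataproc_default": f"{project_number}-compute@developer.gserviceaccount.com",
    }


# Hypothetical project number, for illustration only.
accounts = cdf_service_accounts("123456789012")
print(accounts["data_fusion"])
print(accounts["dataproc_default"])
```

Passing a project ID instead of the project number raises a `ValueError`, which guards against the most likely copy-and-paste mistake.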

Steps to connect external resources

  1. Navigate to the customer project that contains the CDF instance and copy the project number (found on the Home Page in the Project Info card).

  2. Navigate to the project that contains the resources you would like to interact with.

  3. In the sidebar, click on ‘IAM & Admin’.

  4. Click on ‘Add’ at the top of the page.

  5. Provide the first service account name from the table above. Be sure to replace <project_number> with the actual number you obtained in step 1.

  6. Grant the Admin role for the resource you would like to interact with. For example, grant BigQuery Admin for reading from and writing to BigQuery.

  7. Repeat steps 5 & 6 for the second service account in the table above.

  8. In your pipeline, ensure you define the correct Project Id for the sources/sinks. Using ‘auto-detect’ defaults to the customer project that contains the CDF instance, not the project that holds the external resources.
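Steps 5 to 7 amount to adding both service accounts as members of one role binding on the external project. The sketch below shows what that change looks like on an IAM policy represented as a plain Python dictionary (a `bindings` list of `role` and `members` entries). It does not call any Google Cloud API. The function name `grant_role`, the sample project number, and the choice of `roles/bigquery.admin` (the role ID for BigQuery Admin) are assumptions for illustration, and conditional bindings are ignored for simplicity.

```python
import copy


def grant_role(policy: dict, project_number: str, role: str) -> dict:
    """Return a copy of `policy` with both CDF service accounts bound to `role`.

    `policy` is assumed to be an IAM policy in dictionary form:
    {"bindings": [{"role": "...", "members": ["serviceAccount:..."]}]}.
    """
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    members = [
        f"serviceAccount:service-{project_number}@gcp-sa-datafusion.iam.gserviceaccount.com",
        f"serviceAccount:{project_number}-compute@developer.gserviceaccount.com",
    ]
    bindings = updated.setdefault("bindings", [])
    # Reuse the existing binding for this role if there is one.
    binding = next((b for b in bindings if b.get("role") == role), None)
    if binding is None:
        binding = {"role": role, "members": []}
        bindings.append(binding)
    # Add each account once, so repeating the call changes nothing.
    for member in members:
        if member not in binding["members"]:
            binding["members"].append(member)
    return updated


# Hypothetical project number of the project that contains the CDF instance.
policy = grant_role({"bindings": []}, "123456789012", "roles/bigquery.admin")
for member in policy["bindings"][0]["members"]:
    print(member)
```

Because the function is idempotent, it can be run again after adding a new role without duplicating members. Applying the resulting policy to the external project is still done through the console steps above or your usual IAM tooling.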
