By default, Data Fusion can read from and write to BigQuery, GCS, Pub/Sub, Spanner, and Bigtable in the project where the Data Fusion instance is created. To access other GCP resources, or any of these resources in a different project, follow the instructions below.

...

Create a Data Fusion instance.

Setting up permissions for Datastore or BigQuery

Data Fusion uses a service account to access GCP resources in Wrangler, in preview, and in data pipelines running on Dataproc. Services that run in the tenant project, such as preview and Wrangler, use a service account in the following format: service-<customer-project-number>@gcp-sa-datafusion.iam.gserviceaccount.com. This service account is created when the Cloud Data Fusion API is enabled on the project. Actual data pipeline execution on the Dataproc cluster uses the Compute Engine default service account. Any additional GCP resource that Data Fusion needs to access must grant appropriate permissions to both of these service accounts.

Info

To find the customer project number, navigate to the customer project that contains the CDF instance and copy the project number from the Project Info card on the Home page.
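As a minimal sketch of the naming described above, the two service account addresses can be derived from the project number. The function name and the sample project number are hypothetical. The Compute Engine default service account format (<project-number>-compute@developer.gserviceaccount.com) is not stated on this page; it is the standard GCP format and is included here as an assumption worth verifying on the IAM page.

```python
def data_fusion_service_accounts(project_number: str) -> dict:
    """Return the two service accounts that need access to extra GCP resources.

    project_number is the numeric project number (not the project ID) of the
    customer project that contains the Data Fusion instance.
    """
    if not project_number.isdigit():
        raise ValueError("project number must be numeric, e.g. '123456789012'")
    return {
        # Used by services in the tenant project (preview, Wrangler).
        "data_fusion": (
            f"service-{project_number}@gcp-sa-datafusion.iam.gserviceaccount.com"
        ),
        # Assumed standard format of the Compute Engine default service
        # account, used for pipeline execution on Dataproc.
        "compute_default": f"{project_number}-compute@developer.gserviceaccount.com",
    }


if __name__ == "__main__":
    # "123456789012" is a placeholder project number.
    for name, email in data_fusion_service_accounts("123456789012").items():
        print(f"{name}: {email}")
```

Both addresses should appear on the IAM page of the customer project, which is a quick way to confirm the derived values before granting roles.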

For example, to add access to Datastore from preview and Wrangler, follow the steps below:

  1. In the GCP Console, open the IAM & Admin page.

  2. In the left bar, click IAM.

  3. Edit roles for service-<customer-project-number>@gcp-sa-datafusion.iam.gserviceaccount.com.

  4. On the Edit permissions page, add the role Cloud Datastore Owner and click Save.

  5. Perform similar steps (i.e., add the same role) for the Compute Engine default service account to allow the data pipeline to access Datastore during its execution on Dataproc.

To provide access to BigQuery, follow the same steps to add the BigQuery Admin and BigQuery Data Owner roles to both the Data Fusion service account and the Compute Engine default service account.
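The console steps above can also be expressed as gcloud commands. The following is a hedged sketch that only builds the command strings and does not run them. Several details are assumptions not taken from this page: the role IDs are the IAM identifiers that correspond to the console role names (Cloud Datastore Owner, BigQuery Admin, BigQuery Data Owner), the Compute Engine default service account uses the standard <project-number>-compute@developer.gserviceaccount.com format, and the function name, project ID, and project number are placeholders.

```python
import shlex

# Assumed mapping from the console role names on this page to IAM role IDs.
ROLES = {
    "datastore": ["roles/datastore.owner"],
    "bigquery": ["roles/bigquery.admin", "roles/bigquery.dataOwner"],
}


def grant_commands(resource_project_id, fusion_project_number, resource):
    """Build gcloud commands granting the roles to both service accounts.

    resource_project_id: project whose IAM policy is edited.
    fusion_project_number: number of the project with the Data Fusion instance.
    resource: "datastore" or "bigquery".
    """
    members = [
        # Data Fusion service account (preview, Wrangler).
        f"serviceAccount:service-{fusion_project_number}"
        "@gcp-sa-datafusion.iam.gserviceaccount.com",
        # Compute Engine default service account (Dataproc execution).
        f"serviceAccount:{fusion_project_number}"
        "-compute@developer.gserviceaccount.com",
    ]
    commands = []
    for member in members:
        for role in ROLES[resource]:
            args = [
                "gcloud", "projects", "add-iam-policy-binding",
                resource_project_id,
                f"--member={member}",
                f"--role={role}",
            ]
            commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    # Placeholder project ID and number; review each command before running.
    for command in grant_commands("my-project", "123456789012", "bigquery"):
        print(command)
```

Printing the commands instead of executing them keeps the sketch free of side effects, so each binding can be reviewed before it is applied.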

...
