Running pipelines in Shared VPC
This page describes how to launch a pipeline in a GCP Shared VPC. To learn more about Shared VPC, refer to the GCP documentation.
As of 6.1.0.5, you can provision a private CDF instance via the REST API and peer it with a Shared VPC (go/cdf-private-instance-guide-ext). In that case, the steps on this page are NOT necessary.
You must have set up a Shared VPC.
The Shared VPC must be accessible from the service project in which you want the Dataproc cluster associated with the pipelines to run.
You must have granted the Compute Network User role on the shared network to the Dataproc service account of your service project.
You must have granted the Compute Network User role on the shared network to the service account associated with your CDAP/CDF instance.
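The two role grants above can be sketched as gcloud commands. The following Python snippet is only an illustration: it builds the command lines and prints them, and does not execute anything. It assumes the role is granted at the host-project level (a broader scope than a single network or subnet; adjust if you grant per subnet), and the project ID and service account addresses are hypothetical placeholders that you must replace with your own values.

```python
import shlex

# Role ID for "Compute Network User".
ROLE = "roles/compute.networkUser"


def network_user_binding_cmd(host_project, service_account):
    """Build a gcloud command that grants ROLE on the host project.

    Assumption: a host-project-level binding is acceptable. Nothing is
    executed here; the command is returned as an argument list.
    """
    return [
        "gcloud", "projects", "add-iam-policy-binding", host_project,
        f"--member=serviceAccount:{service_account}",
        f"--role={ROLE}",
    ]


def build_commands(host_project, service_accounts):
    """Return one grant command per service account."""
    return [network_user_binding_cmd(host_project, sa) for sa in service_accounts]


if __name__ == "__main__":
    # Hypothetical placeholders: substitute your real host project ID and
    # the Dataproc and CDAP/CDF service accounts of your service project.
    accounts = [
        "DATAPROC_SERVICE_ACCOUNT_EMAIL",
        "CDF_SERVICE_ACCOUNT_EMAIL",
    ]
    for cmd in build_commands("HOST_PROJECT_ID", accounts):
        print(shlex.join(cmd))
```

Review the printed commands before running them with credentials that are allowed to modify IAM policy on the host project.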
Configuring for Shared VPC
Create a new profile following the Creating Profiles documentation.
While creating the profile, set the Network name to your shared VPC network and the Network Host Project ID to the host project of your shared VPC network.
Select this profile when you run your pipeline. The pipeline and its associated Dataproc cluster are then launched in the service project, using the Shared VPC network.
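If you prefer to script profile creation instead of using the UI, the profile settings described above might be expressed as a JSON document along the following lines. This is a rough sketch under stated assumptions, not the documented profile schema: the provisioner name "gcp-dataproc" and the property keys "network" and "networkHostProjectId" are assumed names for the Network and Network Host Project ID fields, and should be verified against the Creating Profiles documentation for your release.

```python
import json


def shared_vpc_profile(network, host_project_id,
                       description="Dataproc profile on a Shared VPC"):
    """Build a profile document as a plain dict.

    Assumptions (not confirmed by this page): the provisioner name and the
    property keys below mirror the UI fields "Network" and
    "Network Host Project ID".
    """
    return {
        "description": description,
        "provisioner": {
            "name": "gcp-dataproc",  # assumed provisioner name
            "properties": [
                # assumed key for the "Network" field
                {"name": "network", "value": network},
                # assumed key for the "Network Host Project ID" field
                {"name": "networkHostProjectId", "value": host_project_id},
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical values: replace with your shared network and host project.
    profile = shared_vpc_profile("my-shared-network", "my-host-project")
    print(json.dumps(profile, indent=2))
```

The function only builds and prints the document; how it is submitted to your instance is left out on purpose, since that depends on your deployment.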