...
For connecting a private CloudSQL instance we need a proxy GCE VM that helps us to connect Cloud Data Fusion to CloudSQL.
Instructions
Create a private Cloud Data Fusion instance as per instructions here. This involves
Setting up VPC network
Allocate Allocating IP range
Create a private Cloud SQL instance in the same VPC network as Cloud Data Fusion instance.
Once the instance is created, go to the CloudSQL instance details page to the “Connect to this instance” section and copy the IP and the connection name.
Now that both the instances are setup set up, create a private GCE VM (with only internal IP) using the following gcloud command,:
Code Block |
---|
export PROJECT=<customer-project> export REGION=<vm-region> export ZONE=`gcloud compute zones list --filter="name=${REGION}" --limit 1 --uri --project=${PROJECT}| sed 's/.*\///'` export SUBNET=<customer-vpc-subnet-name> export NAME=<gce-vm-name> export MYSQL_CONN=<mysql-instance-connection-name> # Create the VM gcloud beta compute --project=${PROJECT} instances create ${NAME} --zone=${ZONE} --machine-type=g1-small --subnet=${SUBNET} --no-address --metadata=startup-script="docker run -d -p 0.0.0.0:3306:3306 gcr.io/cloudsql-docker/gce-proxy:1.16 /cloud_sql_proxy -instances=${MYSQL_CONN}=tcp:0.0.0.0:3306" --maintenance-policy=MIGRATE --scopes=https://www.googleapis.com/auth/cloud-platform --image=cos-69-10895-385-0 --image-project=cos-cloud # Get the VM internal IP export IP=`gcloud compute instances describe ${NAME} --zone ${ZONE} | grep "networkIP" | awk '{print $2}'` # Promote the VM internal IP to static IP gcloud compute addresses create mysql-proxy --addresses ${IP} --region ${REGION} --subnet ${SUBNET} # Note down the IP to be used in CDF MySQL JDBC connection string echo ${IP} |
Once the VM is created with a static IP, now use this as the host for the mysql database to be accessed from Cloud Data Fusion.
...
An example of accessing the mysql database from Wrangler can look like the following screenshot,:
...
The same can be accessed directly from pipeline studio, using the Pipeline Studio by using the Database source plugin with similar properties (JDBC connection string, username and password & plugin type), like the following screenshot,:
...
A similar setup should work for both Postgres and SQLServer instances as well.
...