This document provides a step-by-step guide to building a Cloud Data Fusion pipeline that reads data from PostgreSQL, transforms it, and writes the results to BigQuery.
Prerequisites
Before creating a Cloud Data Fusion pipeline that reads data from PostgreSQL and writes to BigQuery, make sure PostgreSQL is set up and accessible from the Cloud Data Fusion instance.
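Before going further, it helps to confirm that the PostgreSQL host and port are reachable from your network. As a quick check, a small script like the one below attempts a TCP connection; the host and port shown are placeholders for your own PostgreSQL server.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical address -- replace with your PostgreSQL server's host and port.
# print(is_reachable("10.0.0.5", 5432))
```

Note that reachability from your workstation does not guarantee reachability from the Cloud Data Fusion instance, which runs inside Google Cloud; network peering or firewall rules may still need to be configured.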
Instructions
Add your PostgreSQL password as a secure key, so that it is encrypted and stored on your Cloud Data Fusion instance.
On any of the pipeline pages on Cloud Data Fusion, click the System Admin tab in the top right menu.
...
4. In the dropdown menu, choose PUT.
5. In the body of your HTTP call, enter
...
Replace “<your_password>” with your PostgreSQL password.
...
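The steps above amount to a single HTTP PUT against the instance's secure-keys endpoint. As a sketch, the snippet below builds that request with Python's standard library; the endpoint URL, namespace, and key name are illustrative assumptions to adapt to your instance, and the CDAP REST API path and body fields shown should be verified against your Data Fusion version.

```python
import json
import urllib.request

# Hypothetical values -- substitute your instance's API endpoint,
# namespace, and the key name you want to create.
api_endpoint = "https://example-instance.datafusion.googleusercontent.com/api"
namespace = "default"
key_name = "postgresql-password"

# Body of the PUT call that stores the password as a secure key.
body = {
    "description": "PostgreSQL password for the Wrangler connection",
    "data": "<your_password>",
    "properties": {},
}

url = f"{api_endpoint}/v3/namespaces/{namespace}/securekeys/{key_name}"
request = urllib.request.Request(
    url,
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# urllib.request.urlopen(request) would send the call; in practice an
# Authorization header with a valid OAuth access token is also required.
```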
Connect to PostgreSQL using Wrangler
1. Navigate to the Wrangler page.
...
8. In the Add connection: Database window, click the database type you chose in the previous steps. It should now appear with your JAR name underneath it, instead of the previous Upload link.
...
Tip: Once you’ve completed all the steps, you can click the newly connected database in the left navigation panel and see the list of tables for that database.
Transform data using Wrangler and build your Data Fusion pipeline
This section uses an example to demonstrate how to transform data: we search for a “persons” table and remove its “first_name” column.
...
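In Wrangler, removing a column corresponds to a single directive in the recipe. Assuming the hypothetical “persons” table from the example above, the generated step might look like the following (the exact directive syntax can vary slightly between Wrangler versions):

```
drop first_name
```

The Wrangler UI adds this directive for you when you choose Delete column from the column's menu, so you normally do not need to type it by hand.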
Tip: Once the pipeline above succeeds, preview the written data in BigQuery.
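To spot-check the output, you can run a short query in the BigQuery console; the project, dataset, and table names below are placeholders for whatever you configured in the BigQuery sink.

```sql
SELECT *
FROM `my-project.my_dataset.persons`
LIMIT 10;
```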
Related articles
How to use JDBC drivers with Cloud Data Fusion