This document provides step by step guide to build a Data Fusion data pipeline that reads data from IBM DB2 database and writes to GCP specifically BigQuery.
Prerequisite
Before starting to build Cloud Data Fusion pipeline that reads data from IBM DB2 database and writes to BigQuery, make sure DB2 is set up and accessible from Cloud Data Fusion instance.
Instructions
Upload the DB2 database driver to Cloud Data Fusion.
Go to Hub.
Click on Drivers and select the IBM DB2 driver from the list.
Clicking on the IBM DB2 icon asks you to follow 2 step process.
Download the IBM DB2 driver based on the instructions provided.
Find the db2jcc4.jar in the downloaded zip file and upload it to Cloud Data Fusion by clicking on the Deploy.
Provide the configurations by clicking on the Driver configuration.
Note the name of the driver specified (db211 in this case) which will be used while configuring the plugins. Click Finish to complete the driver uploading process.
...
Deploy IBM DB2 plugins from Hub.
Go back to Hub again and find the DB2 plugins under plugins section.
Click on the DB2 plugins and then Deploy them on Cloud Data Fusion instance.
Once plugins are deployed, click on Create Pipeline to go to Data pipeline studio.
Create data pipeline.
Select the IBM DB2 source from the list of sources.
Configure the DB2 plugin by clicking on the properties. Provide the basic configurations as follows:
where,
Driver name: Name of the driver provided while uploading the DB2 driver
Host: Host where the DB2 server is running
Port: Port of DB2 server
Database: Name of the DB2 database
Import Query: Select query to import data from the DB2 table
Also provide credentials required to connect to the DB2 server.
Once the details are filled, click on the GetSchema button followed by Execute to import schema from the source table as
Click Apply to use this schema as an output schema for the plugin.
Click on the sink and choose BigQuery sink.
Configure BigQuery sink as below specifically by providing the Dataset and Table name as:
Click on Deploy once the pipeline is built.
Run the data pipeline.
Click on Run to run the pipeline.
Tip |
---|
Once the above pipeline succeeds, Preview the written data in BigQuery. |
Related articles
Filter by label (Content by label) | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
...