DB2 database plugin
Introduction
A separate database plugin to support DB2-specific features and configurations.
Use-Case
- Users can choose and install DB2 source and sink plugins.
- Users should see the DB2 logo on the plugin configuration page for a better experience.
- Users should get relevant information from the tool tips:
  - The tool tip for the connection string should be customized specifically for the DB2 database.
  - Each tool tip should accurately describe what its field is used for.
- Users should not have to specify any redundant configuration (e.g., JDBC type in the source plugin, columns in the sink plugin).
- Users should get field level lineage for the source and sink that is being used.
- Reference documentation should be updated to account for the changes.
- The source code for the DB2 database plugin should be placed in a repo under the data-integrations org.
- Integration tests for the DB2 database plugin should be added in the test repo.
- A data pipeline using the source and sink plugins should run on both the MapReduce and Spark engines.
User Stories
- Users should be able to install DB2-specific database source and sink plugins from the Hub
- Users should have each tool tip accurately describe what each field does
- Users should get field level lineage information for the DB2 source and sink
- Users should be able to set up a pipeline without specifying redundant information
- Users should get an updated reference document for the DB2 source and sink
- Users should be able to read all the DB2 data types
Plugin Type
- Batch Source
- Batch Sink
- Real-time Source
- Real-time Sink
- Action
- Post-Run Action
- Aggregate
- Join
- Spark Model
- Spark Compute
Design Tips
DB2 connector reference: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.apdv.java.doc/src/tpc/imjcc_r0052075.html
Existing database plugins: https://github.com/cdapio/hydrator-plugins/tree/develop/database-plugins
DB2 datatypes mappings and conversions: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.apdv.java.doc/src/tpc/imjcc_rjvjdata.html
Design
The suggestion is to create a Maven submodule, db2-plugin, under the database-plugins repo.
Sink Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Uniquely identified name for lineage | |
Host | String | DB2 host | Required (defaults to localhost on UI) |
Port | Number | Port on which DB2 is listening | Optional (default 50000) |
Database | String | Database name to connect | Required |
Username | String | DB username | Required |
Password | Password | User password | Required |
Transaction Isolation Level | Select | Transaction isolation level for queries run by this sink | |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties: | |
Table Name | String | Name of the database table to write to | |
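As a rough illustration of how the Host, Port, and Database properties above could be combined into a JDBC connection string, here is a minimal sketch. The class and method names (`Db2ConnectionSketch`, `buildUrl`, `buildProperties`) are hypothetical and not part of the plugin. The `jdbc:db2://host:port/database` URL form is the type 4 syntax of the IBM Data Server Driver for JDBC, and `user`/`password` are the standard JDBC property names. Letting the Username and Password fields override same-named connection arguments is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, for illustration only; not the plugin's actual code.
class Db2ConnectionSketch {
  // Default port taken from the properties table.
  static final int DEFAULT_PORT = 50000;

  // Builds a type 4 DB2 JDBC URL from the Host, Port and Database properties.
  static String buildUrl(String host, Integer port, String database) {
    if (host == null || host.isEmpty()) {
      throw new IllegalArgumentException("Host is required");
    }
    if (database == null || database.isEmpty()) {
      throw new IllegalArgumentException("Database is required");
    }
    int effectivePort = (port == null) ? DEFAULT_PORT : port;
    return "jdbc:db2://" + host + ":" + effectivePort + "/" + database;
  }

  // Merges Connection Arguments with Username and Password.
  // Assumption: the dedicated fields win over same-named arguments.
  static Properties buildProperties(String user, String password,
                                    Map<String, String> connectionArguments) {
    Properties props = new Properties();
    if (connectionArguments != null) {
      props.putAll(connectionArguments);
    }
    props.setProperty("user", user);
    props.setProperty("password", password);
    return props;
  }

  public static void main(String[] args) {
    Map<String, String> extra = new LinkedHashMap<>();
    extra.put("loginTimeout", "30"); // example argument, chosen for illustration
    System.out.println(buildUrl("localhost", null, "SAMPLE"));
    System.out.println(buildProperties("db2inst1", "secret", extra).stringPropertyNames());
  }
}
```

The resulting URL and `Properties` object could then be passed to `java.sql.DriverManager.getConnection(url, props)`, provided the DB2 JDBC driver is on the classpath.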
Source Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Uniquely identified name for lineage | |
Host | String | DB2 host | Required (defaults to localhost on UI) |
Port | Number | Port on which DB2 is listening | Optional (default 50000) |
Database | String | Database name to connect | Required |
Import Query | String | Query used to import data | Valid SQL query |
Username | String | DB username | Required |
Password | Password | User password | Required |
Bounding Query | String | Returns the min and max of the Split-By Field | Valid SQL query |
Split-By Field Name | String | Field name which will be used to generate splits | |
Number of Splits to Generate | Number | Number of splits to generate | |
Transaction Isolation Level | Select | Transaction isolation level for queries run by this source | |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties: | |
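The Bounding Query, Split-By Field Name, and Number of Splits properties work together: the bounding query supplies the minimum and maximum of the split-by field, and that range is divided among the splits. The sketch below shows one way an integer range could be divided into contiguous, near-equal intervals. It is an illustration under stated assumptions, not the splitting logic the plugin or its execution engine actually uses, and `SplitSketch`, `splits`, and `whereClause` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of range splitting; not the plugin's real algorithm.
class SplitSketch {
  // Divides the inclusive range [min, max] into at most numSplits contiguous
  // intervals whose sizes differ by at most one.
  // Assumes max - min + 1 fits in a long and max < Long.MAX_VALUE.
  static List<long[]> splits(long min, long max, int numSplits) {
    if (numSplits < 1) {
      throw new IllegalArgumentException("Number of splits must be at least 1");
    }
    if (max < min) {
      throw new IllegalArgumentException("max must not be less than min");
    }
    List<long[]> result = new ArrayList<>();
    long total = max - min + 1;
    long base = total / numSplits;
    long remainder = total % numSplits;
    long lo = min;
    for (int i = 0; i < numSplits && lo <= max; i++) {
      // The first `remainder` splits take one extra value each.
      long size = base + (i < remainder ? 1 : 0);
      long hi = lo + size - 1;
      result.add(new long[] {lo, hi});
      lo = hi + 1;
    }
    return result;
  }

  // Renders one split as a condition that could be appended to the import query.
  static String whereClause(String field, long[] range) {
    return field + " >= " + range[0] + " AND " + field + " <= " + range[1];
  }

  public static void main(String[] args) {
    // Suppose the bounding query returned MIN(id) = 1 and MAX(id) = 10.
    for (long[] range : splits(1, 10, 3)) {
      System.out.println(whereClause("id", range));
    }
  }
}
```

When the number of requested splits exceeds the number of distinct values in the range, this sketch returns fewer splits rather than empty ones.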
Action Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Host | String | DB2 host | Required (defaults to localhost on UI) |
Port | Number | Port on which DB2 is listening | Optional (default 50000) |
Database | String | Database name to connect | Required |
Username | String | DB username | Required |
Password | Password | User password | Required |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties: | |
Database Command | String | Database command to run | Valid SQL query |
Data Types Mapping
DB2 Data Type | CDAP Schema Data Type | Support | Comment |
---|---|---|---|
SMALLINT | Schema.Type.INT | + | |
INTEGER | Schema.Type.INT | + | |
BIGINT | Schema.Type.LONG | + | |
DECIMAL(p,s) or NUMERIC(p,s) | Schema.LogicalType.DECIMAL | + | |
DECFLOAT | Schema.Type.STRING | + | |
REAL | Schema.Type.FLOAT | + | |
DOUBLE | Schema.Type.DOUBLE | + | |
CHAR | Schema.Type.STRING | + | |
VARCHAR | Schema.Type.STRING | + | |
CHAR(n) FOR BIT DATA | Schema.Type.BYTES | + | |
VARCHAR(n) FOR BIT DATA | Schema.Type.BYTES | + | |
BINARY | Schema.Type.BYTES | + | |
VARBINARY | Schema.Type.BYTES | + | |
GRAPHIC | Schema.Type.STRING | + | |
VARGRAPHIC | Schema.Type.STRING | + | |
CLOB | Schema.Type.STRING | + | |
BLOB | Schema.Type.BYTES | + | |
DBCLOB | Schema.Type.STRING | + | |
ROWID | | - | Not all flavors of DB2 support this type |
XML | | - | Not all flavors of DB2 support this type |
DATE | Schema.LogicalType.DATE | + | |
TIME | Schema.LogicalType.TIME_MICROS | + | |
TIMESTAMP | Schema.LogicalType.TIMESTAMP_MICROS | + | |
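The mapping table above can be expressed as a simple lookup. The following sketch keeps the CDAP schema types as plain strings so that it compiles without the CDAP API on the classpath. A real implementation would presumably return `Schema` objects instead, and `Db2TypeMappingSketch` and `cdapTypeFor` are hypothetical names. Length and precision arguments such as `(n)` or `(p,s)` are stripped before the lookup, and unsupported types (ROWID, XML) yield an empty result.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup mirroring the mapping table; not the plugin's actual code.
class Db2TypeMappingSketch {
  private static final Map<String, String> MAPPING = new HashMap<>();

  private static void put(String cdapType, String... db2Types) {
    for (String db2Type : db2Types) {
      MAPPING.put(db2Type, cdapType);
    }
  }

  static {
    put("Schema.Type.INT", "SMALLINT", "INTEGER");
    put("Schema.Type.LONG", "BIGINT");
    put("Schema.LogicalType.DECIMAL", "DECIMAL", "NUMERIC");
    put("Schema.Type.FLOAT", "REAL");
    put("Schema.Type.DOUBLE", "DOUBLE");
    put("Schema.Type.STRING", "DECFLOAT", "CHAR", "VARCHAR", "GRAPHIC",
        "VARGRAPHIC", "CLOB", "DBCLOB");
    put("Schema.Type.BYTES", "CHAR FOR BIT DATA", "VARCHAR FOR BIT DATA",
        "BINARY", "VARBINARY", "BLOB");
    put("Schema.LogicalType.DATE", "DATE");
    put("Schema.LogicalType.TIME_MICROS", "TIME");
    put("Schema.LogicalType.TIMESTAMP_MICROS", "TIMESTAMP");
  }

  // Returns the CDAP schema type name for a DB2 type declaration,
  // or Optional.empty() for types the table marks as unsupported.
  static Optional<String> cdapTypeFor(String db2Type) {
    String key = db2Type.toUpperCase(Locale.ROOT)
        .replaceAll("\\s*\\([^)]*\\)", "") // drop (n) or (p,s)
        .trim();
    return Optional.ofNullable(MAPPING.get(key));
  }

  public static void main(String[] args) {
    System.out.println(cdapTypeFor("DECIMAL(10,2)"));
    System.out.println(cdapTypeFor("VARCHAR(20) FOR BIT DATA"));
    System.out.println(cdapTypeFor("XML"));
  }
}
```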
Approach
Create a db2-plugin module in the database-plugins project, reusing existing database-plugins code where possible. Add DB2-specific properties to the configuration and add support for DB2-specific data types. Update the UI widget JSON definitions.
Pipeline Samples
API changes
Deprecated Programmatic APIs
database-plugins is moved to Data Integrations
UI Impact or Changes
Configurable database properties are presented as named text fields instead of arbitrary key/value pairs. The DB2 source and sink are separate entries, shown with the DB2 logo, in the source and sink lists.
Test Scenarios
TODO
Releases
Release X.Y.Z
Related Work
Future work
AuroraDB database plugin