...
PostgreSQL connector reference: https://jdbc.postgresql.org/download/postgresql-9.4.1211.jar
Existing database plugins: https://github.com/cdapio/hydrator-plugins/tree/develop/database-plugins
PostgreSQL datatypes mappings and conversions:
Design
The suggestion is to create maven submodule PostgreSQL under database-plugins repo.
Sink Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Uniquely identified name for lineage | |
Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
Port | Number | Specific port where PostgreSQL running on | Optional (default 5432) |
Database | String | Database name to connect | Required |
Username | String | DB username | Required |
Password | Password | User password | Required |
Transaction Isolation Level | Select | Transaction isolation level for queries run by this sink | |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
Table Name | String | Name of a database table to write to | |
Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds and a value of zero means that it is disabled |
Source Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Reference Name | String | Uniquely identified name for lineage | |
Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
Port | Number | Specific port where PostgreSQL running on | Optional (default 5432) |
Database | String | Database name to connect | Required |
Import Query | String | Query for import data | Valid SQL query |
Username | String | DB username | Required |
Password | String | User password | Required |
Bounding Query | String | Returns max and minof split-By Filed | Valid SQL query |
Split-By Field Name | String | Field name which will be used to generate splits | |
Number of Splits to Generate | Number | Number of splits to generate | |
Transaction Isolation Level | Select | Transaction isolation level for queries run by this sink | |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds and a value of zero means that it is disabled |
Action Properties
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Label | String | Label for UI | |
Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
Port | Number | Specific port where PostgreSQL running on | Optional (default 5432) |
Database | String | Database name to connect | Required |
Username | String | DB username | Required |
Password | String | User password | Required |
Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs as connection arguments, list of properties https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
Database Command | String | Database command to run | Valid SQL query |
Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds and a value of zero means that it is disabled |
Data Types Mapping
Postgres Data Type | CDAP Schema Data Type | Support | Comment |
---|---|---|---|
BIGINT | Schema.Type.LONG | + | |
BIGSERIAL | Schema.Type.LONG | + | Serial is autoincremented |
BIT(N) | Schema.Type.STRING | + | Bit strings are strings of 1's and 0's |
BIT VARYING(N) | Schema.Type.STRING | + | Bit strings are strings of 1's and 0's |
BOOLEAN | Schema.Type.BOOLEAN | + | |
BYTEA | Schema.Type.BYTES | + | |
CHARACTER | Schema.Type.STRING | + | |
CHARACTER VARYING | Schema.Type.STRING | + | |
DOUBLE PRECISION | Schema.Type.DOUBLE | + | |
INTEGER | Schema.Type.INT | + | |
NUMERIC(p, s)/DECIMAL(p, s) | Schema.LogicalType.DECIMAL | + | |
REAL | Schema.Type.FLOAT | + | |
SMALLINT | Schema.Type.INT | + | |
SMALLSERIAL | Schema.Type.INT | + | Serial is autoincremented |
SERIAL | Schema.Type.INT | + | Serial is autoincremented |
TEXT | Schema.Type.STRING | + | |
DATE | Schema.LogicalType.DATE | + | |
TIME [ (P) ] [ WITHOUT TIME ZONE ] | Schema.LogicalType.TIME_MICROS | + | |
TIME [ (P) ] WITH TIME ZONE | Schema.Type.STRING | + | |
TIMESTAMP [ (P) ] [ WITHOUT TIME ZONE ] | Schema.LogicalType.TIMESTAMP_MICROS | + | |
TIMESTAMP [ (P) ] WITH TIME ZONE | Schema.LogicalType.TIMESTAMP_MICROS | + | Postgresql converts it to UTC(see "Time Stamps" section) |
XML | Schema.Type.STRING | + | |
TSQUERY | Schema.Type.STRING | + | |
TSVECTOR | Schema.Type.STRING | + | |
TXID_SNAPSHOT | - | Postgresql specific, see documentation | |
UUID | Schema.Type.STRING | + | |
BOX | Schema.Type.STRING | + | |
CIDR | Schema.Type.STRING | + | |
CIRCLE | Schema.Type.STRING | + | |
INET | Schema.Type.STRING | + | |
INTERVAL | Schema.Type.STRING | + | |
JSON | Schema.Type.STRING | + | |
JSONB | Schema.Type.STRING | + | |
LINE | Schema.Type.STRING | + | |
LSEG | Schema.Type.STRING | + | |
MACADDR | Schema.Type.STRING | + | |
MACADDR8 | Schema.Type.STRING | + | |
MONEY | Schema.Type.STRING | + | |
PATH | Schema.Type.STRING | + | |
PG_LSN | - | Postgresql specific, see documentation | |
POINT | Schema.Type.STRING | + | |
POLYGON | Schema.Type.STRING | + |
Approach
Create a module postgresql-plugin in database-plugins project, reuse existing database-plugins code if possible. Add PostgreSQL-specific properties to configuration, add support for PostgreSQL-specific datatypes. Update UI widgets JSON definitions.
...