...

PostgreSQL datatypes mappings and conversions: 


Design

The proposal is to create a Maven submodule named PostgreSQL under the database-plugins repository.


Sink Properties

| User Facing Name | Type | Description | Constraints |
|---|---|---|---|
| Label | String | Label for UI | |
| Reference Name | String | Unique name used for lineage | |
| Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
| Port | Number | Port that PostgreSQL is running on | Optional (default 5432) |
| Database | String | Database name to connect to | Required |
| Import Query | String | Query for importing data | Valid SQL query |
| Username | String | DB username | Required |
| Password | Password | User password | Required |
| Bounding Query | String | Returns the max and min of the Split-By field | Valid SQL query |
| Number of Splits to Generate | Number | Number of splits to generate | |
| Split-By Field Name | String | Field name used to generate splits | |
| Transaction Isolation Level | Select | Transaction isolation level for queries run by this sink | |
| Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs passed as connection arguments. For the list of properties, see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
| Table Name | String | Name of the database table to write to | |
| Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds; a value of zero disables it. | |
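
As an illustration of how the Host, Port, Database, Username, Password, Connection Arguments, and Connect Timeout properties above could be turned into JDBC connection settings, here is a minimal sketch. The class and method names are hypothetical and not part of the plugin; the URL format and the `user`, `password`, and `connectTimeout` property names are those of the PostgreSQL JDBC driver. Letting the dedicated fields override same-named connection arguments is an assumption, not something this page specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Illustrative only: names here are hypothetical, not the plugin's API. */
final class PostgresConnectionSketch {
    /** Default taken from the Port row of the properties table. */
    static final int DEFAULT_PORT = 5432;

    /** Builds a PostgreSQL JDBC URL; a null port falls back to the default. */
    static String buildUrl(String host, Integer port, String database) {
        int effectivePort = (port == null) ? DEFAULT_PORT : port;
        return "jdbc:postgresql://" + host + ":" + effectivePort + "/" + database;
    }

    /** Merges the key/value connection arguments with the dedicated fields. */
    static Properties buildProperties(String username, String password,
                                      Map<String, String> connectionArguments,
                                      int connectTimeoutSeconds) {
        Properties props = new Properties();
        // Arbitrary arguments go in first so the dedicated fields win on conflict
        // (an assumption made for this sketch).
        props.putAll(connectionArguments);
        props.setProperty("user", username);
        props.setProperty("password", password);
        // Driver parameter in seconds; zero disables the timeout.
        props.setProperty("connectTimeout", String.valueOf(connectTimeoutSeconds));
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("ssl", "true");
        System.out.println(buildUrl("localhost", null, "sales"));
        System.out.println(buildProperties("etl", "secret", extra, 10).getProperty("connectTimeout"));
    }
}
```

The resulting URL and `Properties` object could then be passed to `java.sql.DriverManager.getConnection(url, props)`.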

Source Properties


| User Facing Name | Type | Description | Constraints |
|---|---|---|---|
| Label | String | Label for UI | |
| Reference Name | String | Unique name used for lineage | |
| Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
| Port | Number | Port that PostgreSQL is running on | Optional (default 5432) |
| Database | String | Database name to connect to | Required |
| Import Query | String | Query for importing data | Valid SQL query |
| Username | String | DB username | Required |
| Password | String | User password | Required |
| Bounding Query | String | Returns the max and min of the Split-By field | Valid SQL query |
| Split-By Field Name | String | Field name used to generate splits | |
| Number of Splits to Generate | Number | Number of splits to generate | |
| Transaction Isolation Level | Select | Transaction isolation level for queries run by this source | |
| Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs passed as connection arguments. For the list of properties, see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
| Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds; a value of zero disables it. | |
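
To make the interplay of Bounding Query, Split-By Field Name, and Number of Splits to Generate more concrete, the sketch below divides an integer range (the min and max that a bounding query would return) into contiguous conditions on the split-by field. This only illustrates the idea under the assumption of an integer split-by field; the class and method names are hypothetical, and the actual plugin may delegate split generation to its input format.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: shows one way to derive ranges from min, max, and a split count. */
final class SplitSketch {

    /**
     * Returns WHERE-clause fragments that together cover [min, max] exactly once.
     * Fewer than numSplits fragments are returned when the range has fewer values.
     */
    static List<String> splitConditions(String field, long min, long max, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be at least 1");
        }
        if (max < min) {
            throw new IllegalArgumentException("max must not be less than min");
        }
        List<String> conditions = new ArrayList<>();
        long span = max - min + 1;          // number of values in the range
        long base = span / numSplits;       // minimum size of each split
        long remainder = span % numSplits;  // the first 'remainder' splits get one extra value
        long lower = min;
        for (int i = 0; i < numSplits && lower <= max; i++) {
            long size = base + (i < remainder ? 1 : 0);
            long upper = lower + size - 1;
            conditions.add(field + " >= " + lower + " AND " + field + " <= " + upper);
            lower = upper + 1;
        }
        return conditions;
    }

    public static void main(String[] args) {
        // Example: a bounding query returned min = 1 and max = 10; three splits requested.
        for (String condition : splitConditions("id", 1, 10, 3)) {
            System.out.println(condition);
        }
    }
}
```

Each fragment could be appended to the import query so that every split reads a disjoint slice of the table.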


Action Properties


| User Facing Name | Type | Description | Constraints |
|---|---|---|---|
| Label | String | Label for UI | |
| Host | String | PostgreSQL host | Required (defaults to localhost on UI) |
| Port | Number | Port that PostgreSQL is running on | Optional (default 5432) |
| Database | String | Database name to connect to | Required |
| Username | String | DB username | Required |
| Password | String | User password | Required |
| Connection Arguments | Keyvalue | A list of arbitrary string tag/value pairs passed as connection arguments. For the list of properties, see https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters | |
| Database Command | String | Database command to run | Valid SQL query |
| Connect Timeout | Number | The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds; a value of zero disables it. | |


Data Types Mapping

| Postgres Data Type | CDAP Schema Data Type | Support | Comment |
|---|---|---|---|
| BIGINT | Schema.Type.LONG | + | |
| BIGSERIAL | Schema.Type.LONG | + | Serial is auto-incremented |
| BIT(N) | Schema.Type.STRING | + | Bit strings are strings of 1's and 0's |
| BIT VARYING(N) | Schema.Type.STRING | + | Bit strings are strings of 1's and 0's |
| BOOLEAN | Schema.Type.BOOLEAN | + | |
| BYTEA | Schema.Type.BYTES | + | |
| CHARACTER | Schema.Type.STRING | + | |
| CHARACTER VARYING | Schema.Type.STRING | + | |
| DOUBLE PRECISION | Schema.Type.DOUBLE | + | |
| INTEGER | Schema.Type.INT | + | |
| NUMERIC(p, s)/DECIMAL(p, s) | Schema.LogicalType.DECIMAL | + | |
| REAL | Schema.Type.FLOAT | + | |
| SMALLINT | Schema.Type.INT | + | |
| SMALLSERIAL | Schema.Type.INT | + | Serial is auto-incremented |
| SERIAL | Schema.Type.INT | + | Serial is auto-incremented |
| TEXT | Schema.Type.STRING | + | |
| DATE | Schema.LogicalType.DATE | + | |
| TIME [ (P) ] [ WITHOUT TIME ZONE ] | Schema.LogicalType.TIME_MICROS | + | |
| TIME [ (P) ] WITH TIME ZONE | Schema.Type.STRING | + | |
| TIMESTAMP [ (P) ] [ WITHOUT TIME ZONE ] | Schema.LogicalType.TIMESTAMP_MICROS | + | |
| TIMESTAMP [ (P) ] WITH TIME ZONE | Schema.LogicalType.TIMESTAMP_MICROS | + | PostgreSQL converts it to UTC (see the "Time Stamps" section) |
| XML | Schema.Type.STRING | + | |
| TSQUERY | Schema.Type.STRING | + | |
| TSVECTOR | Schema.Type.STRING | + | |
| TXID_SNAPSHOT | | - | PostgreSQL-specific, see documentation |
| UUID | Schema.Type.STRING | + | |
| BOX | Schema.Type.STRING | + | |
| CIDR | Schema.Type.STRING | + | |
| CIRCLE | Schema.Type.STRING | + | |
| INET | Schema.Type.STRING | + | |
| INTERVAL | Schema.Type.STRING | + | |
| JSON | Schema.Type.STRING | + | |
| JSONB | Schema.Type.STRING | + | |
| LINE | Schema.Type.STRING | + | |
| LSEG | Schema.Type.STRING | + | |
| MACADDR | Schema.Type.STRING | + | |
| MACADDR8 | Schema.Type.STRING | + | |
| MONEY | Schema.Type.STRING | + | |
| PATH | Schema.Type.STRING | + | |
| PG_LSN | | - | PostgreSQL-specific, see documentation |
| POINT | Schema.Type.STRING | + | |
| POLYGON | Schema.Type.STRING | + | |
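
A lookup over the mapping table could be sketched as follows. The `CdapType` enum is a stand-in for the `Schema.Type` and `Schema.LogicalType` values named in the table, not the real CDAP classes, and the class and method names are hypothetical. Only a representative subset of rows is included; unsupported types such as TXID_SNAPSHOT and PG_LSN are simply absent from the map.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Illustrative only: a subset of the mapping table expressed as a lookup. */
final class PostgresTypeMappingSketch {

    /** Stand-in for the CDAP schema types named in the table (assumption). */
    enum CdapType {
        LONG, INT, STRING, BOOLEAN, BYTES, DOUBLE, FLOAT,
        DECIMAL, DATE, TIME_MICROS, TIMESTAMP_MICROS
    }

    private static final Map<String, CdapType> MAPPING = new HashMap<>();

    static {
        MAPPING.put("BIGINT", CdapType.LONG);
        MAPPING.put("BIGSERIAL", CdapType.LONG);
        MAPPING.put("BIT", CdapType.STRING);
        MAPPING.put("BIT VARYING", CdapType.STRING);
        MAPPING.put("BOOLEAN", CdapType.BOOLEAN);
        MAPPING.put("BYTEA", CdapType.BYTES);
        MAPPING.put("CHARACTER VARYING", CdapType.STRING);
        MAPPING.put("DOUBLE PRECISION", CdapType.DOUBLE);
        MAPPING.put("INTEGER", CdapType.INT);
        MAPPING.put("NUMERIC", CdapType.DECIMAL);
        MAPPING.put("DECIMAL", CdapType.DECIMAL);
        MAPPING.put("REAL", CdapType.FLOAT);
        MAPPING.put("SMALLINT", CdapType.INT);
        MAPPING.put("TEXT", CdapType.STRING);
        MAPPING.put("DATE", CdapType.DATE);
        MAPPING.put("TIME", CdapType.TIME_MICROS);
        MAPPING.put("TIME WITHOUT TIME ZONE", CdapType.TIME_MICROS);
        MAPPING.put("TIME WITH TIME ZONE", CdapType.STRING);
        MAPPING.put("TIMESTAMP", CdapType.TIMESTAMP_MICROS);
        MAPPING.put("TIMESTAMP WITHOUT TIME ZONE", CdapType.TIMESTAMP_MICROS);
        MAPPING.put("TIMESTAMP WITH TIME ZONE", CdapType.TIMESTAMP_MICROS);
        MAPPING.put("UUID", CdapType.STRING);
        MAPPING.put("JSONB", CdapType.STRING);
    }

    /** Returns the mapped type, or empty when the type is not in the map. */
    static Optional<CdapType> lookup(String postgresType) {
        String normalized = postgresType.trim()
            .toUpperCase(Locale.ROOT)
            .replaceAll("\\s*\\(.*?\\)", "")  // drop modifiers such as (10, 2) or (3)
            .replaceAll("\\s+", " ");         // collapse repeated whitespace
        return Optional.ofNullable(MAPPING.get(normalized));
    }

    public static void main(String[] args) {
        System.out.println(lookup("NUMERIC(10, 2)"));
        System.out.println(lookup("PG_LSN"));
    }
}
```

Returning an empty `Optional` for unmapped types leaves it to the caller to decide whether to fail validation or skip the column.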



Approach

Create a postgresql-plugin module in the database-plugins project, reusing existing database-plugins code where possible. Add PostgreSQL-specific properties to the configuration and add support for PostgreSQL-specific data types. Update the UI widget JSON definitions.

...