...
The proposal is to use an internal stage, since files can then be downloaded directly using the SnowflakeConnection#downloadStream method.
Source Splitter
The proposal is to determine the number of splits from the number of staged files created by the COPY INTO <location> command. The number of resulting files can be controlled with the MAX_FILE_SIZE copy option. The proposal is to add a "Max Split Size" source configuration option that maps to the MAX_FILE_SIZE copy option.
The LIST command returns a list of files that have been staged.
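As a rough illustration of how "Max Split Size" could feed the unload step, the sketch below assembles a COPY INTO <location> statement with the MAX_FILE_SIZE copy option. The class name, method name, and the stage path and query used in the usage note are hypothetical and not part of the plugin design; the sketch also assumes the import query is passed without a trailing semicolon.

```java
/** Hypothetical helper; the class and method names are illustrative only. */
final class UnloadStatementBuilder {
    private UnloadStatementBuilder() { }

    /**
     * Builds a COPY INTO <location> statement that unloads the result of the
     * import query into the given stage path and caps each output file at
     * maxSplitSizeBytes via the MAX_FILE_SIZE copy option.
     */
    static String build(String stagePath, String importQuery, long maxSplitSizeBytes) {
        if (maxSplitSizeBytes <= 0) {
            throw new IllegalArgumentException("Max Split Size must be positive");
        }
        // Plain concatenation keeps the number free of locale-specific formatting.
        return "COPY INTO " + stagePath
            + " FROM (" + importQuery + ")"
            + " MAX_FILE_SIZE = " + maxSplitSizeBytes;
    }
}
```

For example, `UnloadStatementBuilder.build("@my_stage/unload/", "SELECT * FROM employees", 16777216L)` yields a statement ending in `MAX_FILE_SIZE = 16777216`. MAX_FILE_SIZE is an upper bound per file, so the actual number of files (and therefore splits) still depends on the size of the query result.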
Example of listing the files that match a regular expression (i.e. all file names containing the string data_0) in a named stage (my_csv_stage) with a prefix (/analysis/):
```
list @my_csv_stage/analysis/ pattern='.*data_0.*';
+--------------------+------+----------------------------------+------------------------------+
| name               | size | md5                              | last_modified                |
|--------------------+------+----------------------------------+------------------------------|
| employees01.csv.gz |  288 | a851f2cc56138b0cd16cb603a97e74b1 | Tue, 9 Jan 2018 15:31:44 GMT |
| employees02.csv.gz |  288 | 125f5645ea500b0fde0cdd5f54029db9 | Tue, 9 Jan 2018 15:31:44 GMT |
| employees03.csv.gz |  304 | eafee33d3e62f079a054260503ddb921 | Tue, 9 Jan 2018 15:31:45 GMT |
| employees04.csv.gz |  304 | 9984ab077684fbcec93ae37479fa2f4d | Tue, 9 Jan 2018 15:31:44 GMT |
| employees05.csv.gz |  304 | 8ad4dc63a095332e158786cb6e8532d0 | Tue, 9 Jan 2018 15:31:44 GMT |
+--------------------+------+----------------------------------+------------------------------+
```
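The mapping from LIST output to splits could look roughly like the sketch below: one split per staged file, so the split count equals the file count. `StagedFileSplit` and `StagedFileSplitPlanner` are hypothetical names, and the sketch assumes the name and size values have already been read from the LIST result (for instance from its `name` and `size` columns) into an ordered map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical split descriptor: each split covers exactly one staged file. */
final class StagedFileSplit {
    final String fileName;
    final long sizeBytes;

    StagedFileSplit(String fileName, long sizeBytes) {
        this.fileName = fileName;
        this.sizeBytes = sizeBytes;
    }
}

/** Hypothetical planner; not part of the plugin API described in this document. */
final class StagedFileSplitPlanner {
    private StagedFileSplitPlanner() { }

    /**
     * Creates one split per staged file. The map is assumed to hold
     * file name -> file size in bytes, in the order returned by LIST.
     */
    static List<StagedFileSplit> planSplits(Map<String, Long> stagedFiles) {
        List<StagedFileSplit> splits = new ArrayList<>();
        for (Map.Entry<String, Long> file : stagedFiles.entrySet()) {
            splits.add(new StagedFileSplit(file.getKey(), file.getValue()));
        }
        return splits;
    }
}
```

With the five files from the sample output above, this would produce five splits, each of which could then be read independently from the stage.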
Sink Plugin
For the Sink, it's possible to write data to the internal stage files first (according to the File Sizing Recommendations) and then use the COPY INTO <table> command.
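The two-step sink flow could be sketched as the pair of statements below. This is only an illustration under assumptions: the text does not say how files reach the stage, so the PUT command is used here as one possibility, and the class name, file path, stage path, and table name are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that lists the statements of the two-step load. */
final class SinkLoadStatements {
    private SinkLoadStatements() { }

    /**
     * Returns the statements in execution order:
     * 1. PUT uploads a local file to the internal stage (assumed upload mechanism).
     * 2. COPY INTO <table> loads the staged files into the target table.
     */
    static List<String> build(String localFilePath, String stagePath, String tableName) {
        return Arrays.asList(
            "PUT file://" + localFilePath + " " + stagePath,
            "COPY INTO " + tableName + " FROM " + stagePath);
    }
}
```

Calling `SinkLoadStatements.build("/tmp/out/part-0.csv", "@my_stage/load/", "employees")` returns the PUT statement followed by the COPY INTO statement; in a real sink, each file would be sized according to the File Sizing Recommendations before upload.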
...
- Please refer to Plugin OAuth2 Common Module for the OAuth2 common module design information.
Option 2
Section | User Configuration Label | Label Description | Options | Default | Variable | User Widget |
---|---|---|---|---|---|---|
General | Label | Label for UI. | | | | textbox |
General | Reference Name | Uniquely identified name for lineage. | | | referenceName | textbox |
General | Account Name | Full name of Snowflake account. | | | accountName | textbox |
General | Database | Database name to connect to. | | | database | textbox |
General | Import Query | Query for import data. | | | importQuery | textarea |
Credentials | Username | User identity for connecting to the specified database. | | | username | textbox |
Credentials | Password | Password to use to connect to the specified database. | | | password | password |
Key Pair Authentication | Key Pair Authentication Enabled | If true, plugin will perform Key Pair authentication. | | False | keyPairEnabled | toggle |
Key Pair Authentication | Key File Path | Path to the private key file. | | | path | textbox |
OAuth2 | OAuth2 Enabled | If true, plugin will perform OAuth2 authentication. | | False | oauth2Enabled | toggle |
OAuth2 | Auth URL | Endpoint for the authorization server used to retrieve the authorization code. | | | authUrl | textbox |
OAuth2 | Token URL | Endpoint for the resource server, which exchanges the authorization code for an access token. | | | tokenUrl | textbox |
OAuth2 | Client ID | Client identifier obtained during the Application registration process. | | | clientId | textbox |
OAuth2 | Client Secret | Client secret obtained during the Application registration process. | | | clientSecret | password |
OAuth2 | Scopes | Scope of the access request, which might have multiple space-separated values. | | | scopes | textbox |
OAuth2 | Refresh Token | Token used to receive accessToken, which is end product of OAuth2. | | | refreshToken | textbox |
Advanced | Connection Arguments | A list of arbitrary string tag/value pairs as connection arguments. See: https://docs.snowflake.net/manuals/user-guide/jdbc-configure.html#jdbc-driver-connection-string | | | connectionArguments | keyvalue |
Notes:
- The table above is similar to the one in Option 1.1, except that the splitter-related properties are omitted. Please refer to the Design section for the proposed splitter design.
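To show how the accountName, database, and connectionArguments properties from the table might come together, here is a minimal sketch that assembles a Snowflake JDBC connection string. The class is hypothetical. It assumes accountName holds the account identifier without the `.snowflakecomputing.com` suffix and that the database is passed through the `db` connection parameter; if "Full name" is meant to include the domain, the suffix should not be appended.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

/** Hypothetical helper; not part of the plugin design described above. */
final class SnowflakeJdbcUrl {
    private SnowflakeJdbcUrl() { }

    /**
     * Builds jdbc:snowflake://<account>.snowflakecomputing.com/?db=<database>&key=value...
     * Assumption: accountName does not already contain the domain suffix.
     */
    static String build(String accountName, String database,
                        Map<String, String> connectionArguments) {
        StringBuilder url = new StringBuilder("jdbc:snowflake://")
            .append(accountName)
            .append(".snowflakecomputing.com/?db=")
            .append(encode(database));
        // Append each user-supplied tag/value pair as an extra connection parameter.
        for (Map.Entry<String, String> argument : connectionArguments.entrySet()) {
            url.append('&')
               .append(encode(argument.getKey()))
               .append('=')
               .append(encode(argument.getValue()));
        }
        return url.toString();
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Credentials (username and password, key pair, or OAuth2 token) are deliberately left out of the URL here; they would more likely be supplied through driver properties.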
...