Versions Compared
Key
- This line was added.
- This line was removed.
- Formatting was changed.
Introduction
Data loading in Snowflake (https://docs.snowflake.net/manuals/user-guide/data-load-overview.html) is performed in two, separate steps. The first step is to load the files into a staging area, and the second step is to trigger the upload with a COPY INTO step. This action plugin will execute both steps using Snowflake JDBC.
Use case(s)
- As a financial services company, I would like to build an analytics pipeline to push data into the Snowflake. I would leverage Hydrator, and use this plugin as action to create files in the staging area and trigger the loading of the data.
User Storie(s)
- I would like to create files in the staging area and trigger loading of the data.
- I would like to have capability to load files from internal and external(s3) location.
Plugin Type
- Action
Configurables
This section defines properties that are configurable for this plugin.
User Facing Name | Type | Description | Constraints |
---|---|---|---|
Design / Implementation Tips
- See this document for an overview: https://docs.snowflake.net/manuals/user-guide/data-load-overview.html
Design
Approach(s)
1.For loading files from internal location,below high level commands would be executed in snowflake using jdbc:
a.create or replace stage
b.PUT files onto staging location from the specified path where globbing patterns (i.e. wildcards) are accepted.
c.COPY INTO specified snowflake table.
2.For loading files from external location i.e S3,below high level commands would be executed in snowflake using jdbc:
a.create or replace stage using URL option
b.COPY INTO specified snowflake table having a single VARIANT column.
Properties
Security
Limitation(s)
1.COPY INTO command ignores staged data files that were already loaded into the table.FORCE parameter is not used since loading of same files is not required.https://docs.snowflake.net/manuals/sql-reference/sql/copy-into-table.html
2.JSON/XML/AVRO/PARQUET file format can produce one and only one column of type variant or object or array. Use CSV file format if you want to load more than one column.
3.Default values for format options are considered.https://docs.snowflake.net/manuals/sql-reference/sql/create-stage.html
4.Create or replace stage command is used to create stage so that if stage already exists with different set of files,would be replaced.
Future Work
- Some future work – HYDRATOR-99999
- Another future work – HYDRATOR-99999
Test Case(s)
- Test case #1
- Test case #2
Sample Pipeline
Please attach one or more sample pipeline(s) and associated data.
Pipeline #1
Pipeline #2
Table of Contents
Table of Contents style circle
Checklist
- User stories documented
- User stories reviewed
- Design documented
- Design reviewed
- Feature merged
- Examples and guides
- Integration tests
- Documentation for feature
- Short video demonstrating the feature