DB Export Netezza
- Todd Greenstein
- Romy Khetan
Introduction
- As a Hydrator User I want a plugin that I can use to export data from Netezza so that I can use the data in my pipeline.
- As a Hydrator User I want a Netezza plugin that exports data efficiently using existing Netezza tools.
- As a Hydrator User I want the export data capability of the Netezza plugin to be based on a sql query that I issue.
- User should be able to specify credentials.
- Passwords should not be viewable in plain text from inside pipeline viewer or hydrator studio.
- User should be able to specify Netezza Instance
- User should be able to specify location of nzqsql Utility.
- User should be able to specify type of output.
- User should be able to specify location of output files.
- User should know of connectivity errors, or malformed queries/output identifier.
Example
Example for how the plugin should work
Implementation Tips
NZSQL is a command line utility that can run remotely to allow exporting. The output format can be specified to a flat file with specified delimiter or an external table.
Command example to use External Table: CREATE EXTERNAL TABLE '/tmp/export.csv' USING (DELIM ',') AS
SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x;
Design
Design:
Input Json Format:
{ "name": "Netezza", "type": "batchsource", "properties": { "userName": "cdap", "password": "****", "host": "localhost", "nzsqlPath": "/opt/netezza", "db": "user", "outputType": "File", "delimiter": ",", "outputPath": "/home/cdap/netezza", "sqlQuery": "select * from table_with_90_columns" } }
Properties:
userName: User name for user db.
- password: Password for user db.
- host: Hostname or IP of the Netezza instance.
- nzsqlPath: Path of nzsql utility.
- db: Name of the DB from on which sql query needs to be executed.
- outputType: Type of the output. It may be Flat file or External table.
- delimiter: Delimiter needs to be specified which can be used while writing records in a flat file.
- outputPath: Path of the output files.
- sqlQuery: Sql query which needs to be executed on the specified db name to export the data.
Approach:
- Ganymed SSH-2 library can be used to connect to the Netezza instance to run NZSQL utility as specified by the user.
- The output format can only be among : Flat files(with specified delimiters) or external table.
- Assuming the sql query provided by the user will be in below format:
For example:
1. nzsql -d {dbname} -u {username} -h {host} -c "select * from table_with_90_columns " -t -F -o output.csv
2. CREATE EXTERNAL TABLE '/tmp/export.csv' USING (DELIM ',') AS SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x;
- User will only specify the query part like "select * from table_with_90_columns" or "SELECT foo.x, bar.y, bar.dt FROM foo, bar WHERE foo.x = bar.x" in "sqlQuery" input and rest of the command has to be created at the plugin side using the inputs provided by the user.
- In the host input, user will specify the hostname for the Netezza machine instance.
Table of Contents
Checklist
- User stories documented
- User stories reviewed
- Design documented
- Design reviewed
- Feature merged
- Examples and guides
- Integration tests
- Documentation for feature
- Short video demonstrating the feature