
Introduction

A plugin that can efficiently export data from Oracle to be used as a source in Hydrator pipelines. Oracle includes command-line tools to export data that can be utilized to perform this task.

 

Use-case
A Hydrator user would like to export Oracle data onto HDFS or the local file system and incorporate it into a pipeline as a source, using an action plugin that does not require a JDBC connection to perform the export from Oracle.

User Stories

  • As a Hydrator user, I want to export data from Oracle to be used in my Hydrator pipeline.
  • As a Hydrator user, I want an Oracle export plugin that exports data efficiently using existing Oracle tools.
  • As a Hydrator user, I want the export capability of the Oracle plugin to be based on a SQL query that I issue.
  • User should be able to specify credentials.
  • Passwords should not be viewable in plain text from inside the pipeline viewer or Hydrator Studio.
  • User should be able to specify the Oracle instance.
  • User should be able to specify the location of the EXP utility.
  • User should be able to specify the type of output.
  • User should be able to specify the location of output files.
  • User should be notified of connectivity errors and malformed queries/output identifiers.

Example

User wants to export the data from the test table using a filter on name='cask', i.e. select * from test where name='cask';

Plugin configurations:

"oracleServerHostname": "example.com",
"oracleServerPort": "22",
"oracleServerUsername": "oracle",
"oracleServerPassword": "oracle@123",
"dbUsername": "system",
"dbPassword": "cask",
"oracleHome": "/u01/app/oracle/product/11.2.0/xe",
"oracleSID": "cask",
"queryToExecute": "select * from test where name='cask';",
"pathToWriteFinalOutput": "/tmp/data.csv",
"format": "csv"

Prerequisites on the DB server:

1. A directory should be configured on the server where we want to place the export files. Steps could be:

CREATE OR REPLACE DIRECTORY test_dir AS '/u01/app/oracle/oradata/';
GRANT READ, WRITE ON DIRECTORY test_dir TO oracle;

2. A param.par file should already be present and should contain the below:

DIRECTORY = test_dir
DUMPFILE = table.dmp
TABLES = test
QUERY = test:"WHERE name = 'cask'"

3. On execution of the plugin, table.dmp would be stored in test_dir.
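With the directory and parameter file in place, the prerequisite-based export boils down to a single expdp invocation. A minimal sketch using the example values from this document; the parameter-file path is an assumption, and since expdp needs a real Oracle installation, the sketch only assembles and prints the command the plugin would run:

```shell
# Assemble the data pump export command (example values from this document;
# the parameter-file path is assumed for illustration).
ORACLE_HOME="/u01/app/oracle/product/11.2.0/xe"
ORACLE_SID="cask"
DB_USERNAME="system"
DB_PASSWORD="cask"
PARFILE="param.par"
export ORACLE_HOME ORACLE_SID

# expdp reads DIRECTORY, DUMPFILE, TABLES and QUERY from the parameter file.
CMD="$ORACLE_HOME/bin/expdp $DB_USERNAME/$DB_PASSWORD parfile=$PARFILE"
echo "$CMD"
```

On a configured server, running the printed command produces table.dmp in the directory named in param.par.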


Implementation Tips

Design

Code Block
titleOracle export json
{
    "name": "OracleExportAction",
    "plugin": {
        "name": "OracleExportAction",
        "type": "action",
        "label": "OracleExportAction",
        "artifact": {
            "name": "core-plugins",
            "version": "1.4.0-SNAPSHOT",
            "scope": "SYSTEM"
        },
        "properties": {
            "oracleServerHostname": "example.com",
            "oracleServerPort": "22",
            "oracleServerUsername": "oracle",
            "oracleServerPassword": "oracle@123",
            "dbUsername": "system",
            "dbPassword": "cask",
            "oracleHome": "/u01/app/oracle/product/11.2.0/xe",
            "oracleSID": "cask",
            "queryToExecute": "select * from test where name='cask';",
            "pathToWriteFinalOutput": "/tmp/data.csv",
            "format": "csv"
        }
    }
}
oracleServerHostname: Hostname of the remote DB machine where the data dump command is to be executed.
oracleServerPort: Port of the remote DB machine to connect to. Defaults to 22.
oracleServerUsername: Username used to connect to the remote DB host.
oracleServerPassword: Password used to connect to the remote DB host.
dbUsername: Username to connect to the Oracle DB.
dbPassword: Password to connect to the Oracle DB.
oracleHome: Path of the ORACLE_HOME.
oracleSID: Oracle SID.
queryToExecute: Query to be executed to export the data. Query should have ';' at the end.
pathToWriteFinalOutput: Path where the output file is to be exported.
format: Format of the output file.
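Tying the properties together: the plugin opens one SSH session to the DB host and runs everything inside it. A minimal sketch assuming the example values above; the ssh command is only printed, not executed, since it needs a reachable host:

```shell
# Example property values from this document.
oracleServerHostname="example.com"
oracleServerPort="22"
oracleServerUsername="oracle"
oracleHome="/u01/app/oracle/product/11.2.0/xe"
oracleSID="cask"
dbUsername="system"
dbPassword="cask"

# Everything runs in one remote session: set the Oracle environment,
# then invoke sqlplus with the generated spool script.
REMOTE_CMD="export ORACLE_HOME=$oracleHome; export ORACLE_SID=$oracleSID; $oracleHome/bin/sqlplus -s $dbUsername/$dbPassword@$oracleSID @/tmp/test.sql"
echo ssh -p "$oracleServerPort" "$oracleServerUsername@$oracleServerHostname" "$REMOTE_CMD"
```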

 

Plugin would run the below sequence of commands in one session:

1. SSH to the box using the provided $oracleServerUsername and $oracleServerPassword, and export ORACLE_HOME and ORACLE_SID:

export ORACLE_HOME=$oracleHome
export ORACLE_SID=$oracleSID

2. Create a script file (/tmp/test.sql) and add the content below. We can take the path of the tmp file as a config from the user, or use the home folder of the logged-in user, where the program would always have access.

set colsep ","
set linesize 10000
set newpage none
set wrap off
set heading off
spool on
select * from test;
spool off
exit

3. Execute $oracleHome/bin/sqlplus -s $dbUsername/$dbPassword@$oracleSID @/tmp/test.sql

4. Read the outstream and write it into the specified output file on local. Before writing, a regex that removes multiline trailing spaces will be applied.

5. Since sqlplus spool generates spaces before and after the column separators, a sed command will be applied to remove the spaces.
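The post-processing described above can be demonstrated on a stand-in for the spool output (the real input comes from the sqlplus run; the sed expression here is one plausible way to strip the padding, not necessarily the exact one the plugin will use):

```shell
# Fake a couple of spool lines with the padding sqlplus adds around colsep.
printf '1   ,   cask   \n2   ,   cask   \n' > /tmp/spool_raw.txt

# Remove spaces around the separator and trailing spaces at end of line.
sed -e 's/ *, */,/g' -e 's/ *$//' /tmp/spool_raw.txt > /tmp/data.csv
cat /tmp/data.csv
```

This yields clean comma-separated lines (1,cask and 2,cask) ready to be written to $pathToWriteFinalOutput.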

 

Table of Contents

Checklist

  •  User stories documented 
  •  User stories reviewed 
  •  Design documented 
  •  Design reviewed 
  •  Feature merged 
  •  Examples and guides 
  •  Integration tests 
  •  Documentation for feature 
  •  Short video demonstrating the feature