
Introduction


The Run plugin lets a user run any executable binary that is installed and available on all Hadoop nodes. The user's executable processes each input record and returns an output record, which is then processed further downstream in the pipeline.

Use-Case


Enterprises often have existing tools or systems that perform complex transformations of data. These tools are time-tested and have been running in production for a long time. As more processing moves to Hadoop, users would like to transition to running on Hadoop gradually. In this case, they want to run the tools as is or with minor modifications. The tool or binary is installed on all Hadoop nodes, and users want to pass each record being processed into the tool and retrieve the results back into the pipeline.

User Stories

  • User should be able to specify the full path to the binary or just the binary name
  • User should be able to specify the arguments for the binary to be executed
  • User should be provided a specification of how the record is passed to the binary (needs to be designed)
  • If the binary doesn’t exist, is not in the path, or is not executable, the user should be notified appropriately at runtime
  • User is able to see the errors in the log if the executable writes errors to STDERR
  • User executable is able to read the record from STDIN
  • User executable is able to write the record to STDOUT
  • User executable is able to write error records to a different file descriptor
  • User should be able to specify the error dataset that receives the data the executable writes to the special file descriptor
  • User will make sure the binary and its dependencies are available on all machines of the cluster; no capability needs to be added to the plugin for marshaling the executable
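
The exchange described in the stories above can be illustrated with a minimal sketch. It is not the plugin's implementation: the record format is still to be designed, so the sketch assumes one newline-terminated text record per invocation, and the names `resolve_binary`, `run_record`, and `CHILD` are hypothetical. The Python interpreter stands in for the user's binary so the example is self-contained. Writing error records to a separate file descriptor is not covered here.

```python
import shutil
import subprocess
import sys
from typing import List, Tuple

# Stand-in for the user's executable (assumption): reads the record from
# STDIN, writes the upper-cased record to STDOUT and a note to STDERR.
CHILD = (
    "import sys; "
    "sys.stdout.write(sys.stdin.read().upper()); "
    "sys.stderr.write('processed 1 record')"
)


def resolve_binary(binary: str) -> str:
    """Accept a full path or a bare name; fail clearly if it cannot be run."""
    path = shutil.which(binary)
    if path is None:
        raise FileNotFoundError(
            f"'{binary}' does not exist, is not in the path, or is not executable"
        )
    return path


def run_record(binary: str, args: List[str], record: str) -> Tuple[str, str]:
    """Pass one record on STDIN; return (STDOUT record, STDERR text)."""
    path = resolve_binary(binary)
    result = subprocess.run(
        [path, *args],
        input=record + "\n",   # assumed format: one record per line
        capture_output=True,   # collect STDOUT and STDERR separately
        text=True,
        check=False,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"{path} exited with {result.returncode}: {result.stderr}"
        )
    return result.stdout.rstrip("\n"), result.stderr


if __name__ == "__main__":
    out, err = run_record(sys.executable, ["-c", CHILD], "a,b,c")
    print(out)                      # output record for the pipeline
    print(err, file=sys.stderr)     # what the plugin would write to its log
```

Resolving the binary before launching it lets the missing or non-executable case surface as a single clear error, and capturing STDERR separately keeps the executable's diagnostics out of the output record.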

Implementation Tips


Checklist

  • User stories documented 
  • User stories reviewed 
  • Design documented 
  • Design reviewed 
  • Feature merged 
  • Examples and guides 
  • Integration tests 
  • Documentation for feature 
  • Short video demonstrating the feature