...
Additional Plugins:
SQL Action:
- Extends Action Class
- Similar logic to QueryAction Plugin
- Config Properties:
- Uses QueryConfig
HDFS Action:
- Extends Action Class
- Config Properties:
- sourcePath: Path of file/directory
- destPath: Path of desired destination
- fileRegex: Wildcard pattern used to select which files to move
- Run Method:
- List all files in the given path and, if `fileRegex` is set, filter them using a `WildCardFileFilter`.
- For each file, call `FileSystem.rename(src, dest)` to move it to the desired location. This requires obtaining the HDFS `FileSystem` first.
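The filter-then-move logic of the Run Method can be sketched as follows. This is a minimal sketch, not the plugin implementation: it assumes the local filesystem (`java.nio.file`) as a stand-in so it runs without a Hadoop cluster, and the class name `FileMoveSketch` and method `moveMatching` are hypothetical. On HDFS, the `Files.move` call would be replaced by `FileSystem.rename(src, dest)`, and the glob matching by the wildcard file filter.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical local-filesystem stand-in for the HDFS Action's run method.
class FileMoveSketch {

  // Moves sourcePath (a single file), or every regular file directly inside
  // sourcePath (a directory) whose name matches the wildcard pattern, into
  // destPath. A null or empty pattern matches all files.
  static List<Path> moveMatching(Path sourcePath, Path destPath, String fileRegex)
      throws IOException {
    String glob = (fileRegex == null || fileRegex.isEmpty()) ? "*" : fileRegex;
    Files.createDirectories(destPath);

    // Collect candidates first so the directory is not modified while listing.
    List<Path> toMove = new ArrayList<>();
    if (Files.isRegularFile(sourcePath)) {
      toMove.add(sourcePath);
    } else {
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(sourcePath, glob)) {
        for (Path candidate : stream) {
          if (Files.isRegularFile(candidate)) {
            toMove.add(candidate);
          }
        }
      }
    }

    List<Path> moved = new ArrayList<>();
    for (Path src : toMove) {
      Path target = destPath.resolve(src.getFileName());
      // On HDFS this step would be FileSystem.rename(src, dest).
      moved.add(Files.move(src, target));
    }
    return moved;
  }

  public static void main(String[] args) throws IOException {
    Path src = Files.createTempDirectory("src");
    Path dest = Files.createTempDirectory("dest");
    Files.createFile(src.resolve("a.csv"));
    Files.createFile(src.resolve("b.txt"));
    // Only a.csv matches the pattern; b.txt stays in the source directory.
    System.out.println(moveMatching(src, dest, "*.csv"));
  }
}
```

Collecting the matching files before moving any of them keeps the listing stable while files are moved out of the source directory.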
File Action:
- Moves files between remote machines
- Extends SSHAction Class
- Config Properties:
- Authentication to ssh into destination machine
- Authentication to get file(s) from source machine
- Run Method:
- Call the superclass run method.
- Build the scp or ftp command and pass it to the superclass constructor.
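One way the scp command could be assembled is sketched below. It assumes the command runs on the destination machine (reached over SSH) and pulls the file from the source machine; that direction, the class name `ScpCommandSketch`, the method `buildPullCommand`, and all parameter names and example hosts are hypothetical. Only the standard scp options `-r`, `-P`, and `-i` are used.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the scp command a File Action might run.
class ScpCommandSketch {

  // Builds an scp invocation that copies sourcePath from the source machine
  // to destPath on the machine where the command is executed.
  static List<String> buildPullCommand(String sourceUser, String sourceHost, int sourcePort,
                                       String privateKeyFile, String sourcePath,
                                       String destPath, boolean recursive) {
    List<String> cmd = new ArrayList<>();
    cmd.add("scp");
    if (recursive) {
      cmd.add("-r");                       // copy directories recursively
    }
    cmd.add("-P");                         // port of the source machine's SSH daemon
    cmd.add(Integer.toString(sourcePort));
    cmd.add("-i");                         // private key used to authenticate to the source
    cmd.add(privateKeyFile);
    cmd.add(sourceUser + "@" + sourceHost + ":" + sourcePath);
    cmd.add(destPath);
    return cmd;
  }

  public static void main(String[] args) {
    List<String> cmd = buildPullCommand(
        "etl", "source.example.com", 22, "/home/etl/.ssh/id_rsa",
        "/data/in/report.csv", "/data/out/", false);
    // Joined with spaces for display; arguments containing spaces would need quoting.
    System.out.println(String.join(" ", cmd));
  }
}
```

Keeping the command as a list of arguments until the last moment makes it easier to quote each argument correctly before handing a single command string to the SSH layer.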
Storing SSH keys:
- An SSH private and public key pair will be generated for the specific user. These keys will be used to SSH into the external machine.
- The user's public key is hosted on the machine that the YARN container running the action will SSH into.
- The user's private key needs to be stored on the YARN cluster so that any container can access it. There are a few options:
- Store the private key as a dataset in HDFS.
- Store the private key in the SecureStore. If the program has access to the SecureStore, the ActionContext will need to expose it.
...