Introduction

Add an option for IAM role-based authentication to the existing S3 source and sink plugins.

Use case(s)

  • In the S3 source and S3 sink (Avro and Parquet) plugins, the user should be able to select the authentication mechanism for S3, including an option for IAM role-based authentication.

User Storie(s)

  • As a pipeline user, I want to have an option for IAM role-based authentication in the S3 source and sink plugins in Hydrator.
  • As a pipeline user, I want the access ID and access key to be mandatory for the Access Credentials authentication method.

Plugin Type

  •  Batch Source
  •  Batch Sink 
  •  Real-time Source
  •  Real-time Sink
  •  Action
  •  Post-Run Action
  •  Aggregate
  •  Join
  •  Spark Model
  •  Spark Compute

Configurables

A new configuration property will be added to the S3 plugins.

Since this is an EC2+IAM capability,user will with roles assigned to all the members.
  • User Facing Name: Authentication Method
  • Type: Select
  • Description: Authentication method to access S3. Defaults to Access Credentials.
  • Constraints: The user needs an AWS environment to use IAM role-based authentication; a non-EC2 environment cannot be used. For Access Credentials, the URI scheme should be s3n://. For IAM, the URI scheme should be s3a://. (Macro-enabled)
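The constraints above, together with the user story that makes the access ID and access key mandatory for Access Credentials, could be checked roughly as follows. This is only a sketch: the class S3AuthConfigValidator, its method names, and the error messages are hypothetical and are not taken from the plugin code. It treats a missing authentication method as the documented default, Access Credentials.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not the plugin's actual code.
final class S3AuthConfigValidator {
  static final String ACCESS_CREDENTIALS = "Access Credentials";
  static final String IAM = "IAM";

  // Returns a list of problems; an empty list means the settings are consistent.
  static List<String> validate(String authenticationMethod, String accessId,
                               String accessKey, String path) {
    List<String> errors = new ArrayList<>();
    // The property defaults to Access Credentials when it is not set.
    String method = authenticationMethod == null ? ACCESS_CREDENTIALS : authenticationMethod;
    if (ACCESS_CREDENTIALS.equals(method)) {
      if (isBlank(accessId)) {
        errors.add("Access ID is required for Access Credentials.");
      }
      if (isBlank(accessKey)) {
        errors.add("Access key is required for Access Credentials.");
      }
      requireScheme(path, "s3n://", method, errors);
    } else if (IAM.equals(method)) {
      // No keys are expected here; only the URI scheme is checked.
      requireScheme(path, "s3a://", method, errors);
    } else {
      errors.add("Unknown authentication method: " + method);
    }
    return errors;
  }

  private static void requireScheme(String path, String scheme, String method,
                                    List<String> errors) {
    if (path == null || !path.startsWith(scheme)) {
      errors.add("Path must use the " + scheme + " scheme for " + method + ".");
    }
  }

  private static boolean isBlank(String value) {
    return value == null || value.trim().isEmpty();
  }
}
```

For example, validate("IAM", null, null, "s3a://bucket/input") returns an empty list, while the same call with an s3n:// path reports one problem. A real plugin would probably have to defer such checks for values supplied as macros until they are resolved.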

 
    

Design / Implementation Tips

  • Tip #1
  • Tip #2

Design

{
  "widget-type": "select",
  "label": "Authentication Method",
  "name": "authenticationMethod",
  "widget-attributes": {
    "values": [
      "Access Credentials",
      "IAM"
    ],
    "default": "Access Credentials"
  }
}

Approach(s)

1. When the user selects the IAM role-based authentication method, the properties related to keys need to be omitted.
2. S3
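One way to picture the first approach is a helper that builds the file system properties handed to Hadoop and leaves out the key properties when IAM is selected. This is a minimal sketch under assumptions: the class S3FileSystemProperties and its build method are hypothetical, and the page does not say which properties the plugins actually set. The names fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey are the standard Hadoop settings for the s3n client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the plugin's actual code.
final class S3FileSystemProperties {

  // Builds the Hadoop properties for the selected authentication method.
  static Map<String, String> build(String authenticationMethod,
                                   String accessId, String accessKey) {
    Map<String, String> properties = new LinkedHashMap<>();
    if ("IAM".equals(authenticationMethod)) {
      // Key properties are omitted on purpose: with IAM the s3a client is
      // expected to pick up credentials from the role attached to the EC2 instance.
      return properties;
    }
    // Access Credentials (the default): pass the keys to the s3n client.
    properties.put("fs.s3n.awsAccessKeyId", accessId);
    properties.put("fs.s3n.awsSecretAccessKey", accessKey);
    return properties;
  }
}
```

With this shape, build("IAM", null, null) yields an empty map, so no secret ever reaches the job configuration in the IAM case.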

Properties

Security

Limitation(s)

1. For all the S3 plugins, only S3 regions that support both signature versions (Version 2 and Version 4) are supported.

2. The user needs an AWS environment to use IAM role-based authentication; a non-EC2 environment cannot be used.

3. The user has to use the s3a Hadoop client to use IAM authentication (URI scheme: s3a://).

Future Work

  • Some future work – HYDRATOR-99999
  • Another future work – HYDRATOR-99999

Test Case(s)

  • S3 batch source with IAM role based authentication
  • S3 batch source with key credentials
  • S3AvroSink with IAM role based authentication
  • S3AvroSink with key credentials
  • S3ParquetSink with IAM role based authentication
  • S3ParquetSink with key credentials

Sample Pipeline

Please attach one or more sample pipeline(s) and associated data. 

Pipeline #1

Pipeline #2

 

 

Checklist

  •  User stories documented 
  •  User stories reviewed 
  •  Design documented 
  •  Design reviewed 
  •  Feature merged 
  •  Examples and guides 
  •  Integration tests 
  •  Documentation for feature 
  •  Short video demonstrating the feature