Introduction
In some instances, the results of a pipeline may need to be posted to an external web service. For example, a processing pipeline could send messages to Slack via its REST endpoint, or send notifications to a third-party website. This sink sends the messages from the pipeline to an external HTTP endpoint.
Use case(s)
- I would like to post a notification to Slack every time a user of my website sees a 500 error. I would like to set up a real-time Spark Streaming pipeline to read my weblog data, filter for messages that have a 500 error, and post a custom message to Slack with details from the message, such as the URL.
- I am leveraging a third-party reporting tool to update metrics in a dashboard. I would like to create a real-time pipeline to generate those metrics and post them to the third-party reporting service. I would like to use a real-time Spark Streaming pipeline, configure windows for aggregations, then send those aggregated stats to the third party using this HTTP Sink.
User Stories
- As a pipeline developer, I would like to post data to an external web service by providing the request method (GET, POST, PUT, DELETE), URL, payload (if POST or PUT), request headers, and timeouts.
- As a pipeline developer, I would like to be able to define a custom POST payload leveraging fields from the message.
- As a pipeline developer, I would like to batch my updates if required, so that the sink posts to the external service only once n messages have accumulated.
- As a pipeline developer, I would like the plugin to retry a configurable number of times before failing the pipeline.
- As a pipeline developer, I would like to be able to send basic auth credentials by providing a username and password in the config.
- As a pipeline developer, I would like to be able to send to HTTP and HTTPS endpoints.
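The retry and basic auth stories can be sketched roughly as follows. This is a minimal illustration, not the plugin's implementation: `basic_auth_header` and `send_with_retries` are hypothetical names, and `send` stands in for whatever performs a single HTTP request and returns its status code. It assumes that any non-2xx status or connection error counts as a failure worth retrying.

```python
import base64
from typing import Callable, Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an HTTP Basic Authorization header from a username and password."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def send_with_retries(send: Callable[[], int], num_retries: int) -> int:
    """Call `send` until it returns a 2xx status or the retries run out.

    `send` is a hypothetical callable that performs one request and returns
    the HTTP status code, or raises OSError on a connection failure.
    """
    attempts = num_retries + 1  # the first attempt plus the configured retries
    last_error = None
    for _ in range(attempts):
        try:
            status = send()
            if 200 <= status < 300:
                return status
            last_error = RuntimeError(f"HTTP status {status}")
        except OSError as exc:
            last_error = exc
    # Raising here is what would fail the pipeline in this sketch.
    raise RuntimeError(f"request failed after {attempts} attempts") from last_error
```

A real sink would probably also wait between attempts; the stories above do not specify a backoff policy, so none is shown.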
Plugin Type
- BatchSink
Configurables
This section defines properties that are configurable for this plugin.
User Facing Name | Type | Description | Constraints | Macro Enabled? |
---|---|---|---|---|
URL | String | Required. The URL to post data to. | | yes |
Request Method | Select | The HTTP request method. | GET, POST, PUT, DELETE | |
Batch Size | String | The number of messages to batch before sending. | > 0, default 1 (no batching) | yes |
Format | Select | The format to send the message in. JSON formats the entire input record as JSON and sends it as the payload. Form converts the input message to a query string and sends it in the payload. Custom sends the contents of the Request Body field. | JSON, Form, Custom | |
Request Body | String | Optional request body. Only required if Custom format is specified. | | yes |
Content Type | String | Used to specify the Content-Type header. | | yes |
Request Headers | KeyValue | An optional string of header values to send in each request where the keys and values are | | yes |
Should Follow Redirects? | Select | Whether to automatically follow redirects. Defaults to true. | true, false | |
Number of Retries | Select | The number of times the request should be retried if the request fails. Defaults to 3. | 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 | |
Connect Timeout | String | The time in milliseconds to wait for a connection. Set to 0 for infinite. Defaults to 60000 (1 minute). | | |
Read Timeout | String | The time in milliseconds to wait for a read. Set to 0 for infinite. Defaults to 60000 (1 minute). | | |
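To make the constraints in the table concrete, here is a small sketch of how the configuration might be modeled and validated. The class and field names (`HttpSinkConfig`, `batch_size`, and so on) are illustrative assumptions rather than the plugin's actual API, and the JSON default for the format is taken from the Approach section rather than from this table.

```python
from dataclasses import dataclass, field
from typing import Dict

ALLOWED_METHODS = ("GET", "POST", "PUT", "DELETE")
ALLOWED_FORMATS = ("JSON", "Form", "Custom")


@dataclass
class HttpSinkConfig:
    """Hypothetical model of the configurables table; names are illustrative."""
    url: str
    method: str                    # the table states no default, so it is required here
    batch_size: int = 1            # 1 means no batching
    message_format: str = "JSON"   # assumed default, per the Approach section
    body: str = ""
    content_type: str = ""
    request_headers: Dict[str, str] = field(default_factory=dict)
    follow_redirects: bool = True
    num_retries: int = 3
    connect_timeout_ms: int = 60000  # 0 means wait indefinitely
    read_timeout_ms: int = 60000     # 0 means wait indefinitely

    def validate(self) -> None:
        """Raise ValueError if any documented constraint is violated."""
        if not self.url:
            raise ValueError("URL is required")
        if self.method not in ALLOWED_METHODS:
            raise ValueError(f"unsupported request method: {self.method}")
        if self.batch_size < 1:
            raise ValueError("batch size must be greater than 0")
        if self.message_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported format: {self.message_format}")
        if self.message_format == "Custom" and not self.body:
            raise ValueError("request body is required for the Custom format")
        if not 0 <= self.num_retries <= 10:
            raise ValueError("number of retries must be between 0 and 10")
        if self.connect_timeout_ms < 0 or self.read_timeout_ms < 0:
            raise ValueError("timeouts must not be negative")
```

For example, `HttpSinkConfig(url="http://example.com/data", method="POST").validate()` passes, while a batch size of 0 or a Custom format with an empty body raises `ValueError`.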
Design / Implementation Tips
- Please use HTTPPoller and HTTPCallback in Hydrator plugins as a reference.
- If a user selects JSON, the Content-Type header should be set to application/json. For Form, it should be set to application/x-www-form-urlencoded.
- When formatting the message as a query string, don't forget to URL-encode the values.
- We will need to define some sort of macro language so that the user can leverage message fields in their POST payload. For example, I might define my payload as { "messageType" : "update", "name" : "%{firstName}" } where %{firstName} will be replaced with the value of the firstName field in the incoming message.
- For batching, messages will be sent separated by a newline (\n) character.
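The formatting tips above could look something like the following sketch. It assumes a record is a flat mapping of field names to values and that the placeholder syntax is exactly `%{fieldName}` as in the example; `format_message` and `CONTENT_TYPES` are hypothetical names introduced for illustration.

```python
import json
import re
from typing import Any, Mapping
from urllib.parse import urlencode

# Matches placeholders such as %{firstName}.
PLACEHOLDER = re.compile(r"%\{(\w+)\}")

# Content-Type implied by each built-in format.
CONTENT_TYPES = {
    "JSON": "application/json",
    "Form": "application/x-www-form-urlencoded",
}


def format_message(record: Mapping[str, Any], message_format: str,
                   body_template: str = "") -> str:
    """Render one record as a request payload in the chosen format."""
    if message_format == "JSON":
        return json.dumps(record)
    if message_format == "Form":
        # urlencode percent-encodes both keys and values.
        return urlencode(record)
    if message_format == "Custom":
        # Replace each %{field} with the record's value. A missing field
        # raises KeyError; values are inserted as-is, without JSON escaping.
        return PLACEHOLDER.sub(lambda m: str(record[m.group(1)]), body_template)
    raise ValueError(f"unsupported format: {message_format}")
```

With a record of `{"firstName": "Ada"}` and the template from the tip above, the Custom format yields `{ "messageType" : "update", "name" : "Ada" }`. Whether substituted values should be escaped for the target content type is left open here.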
Design
Code Block |
---|
{
  "name": "HTTP Sink",
  "plugin": {
    "name": "HTTP",
    "type": "batchsink",
    "label": "HTTP Sink",
    "artifact": {
      "name": "http-sink-plugin",
      "version": "1.6.0",
      "scope": "SYSTEM"
    },
    "properties": {
      "referenceName": "HTTP Sink Plugin",
      "url": "http://example.com/data",
      "method": "POST",
      "batchSize": "1",
      "messageFormat": "JSON",
      "body": "{\"text\" : \"Hello Slack\"}",
      "delimiterForMessages": "\n",
      "charset": "UTF-8",
      "followRedirects": "true",
      "disableSSLValidation": "true",
      "numRetries": 3,
      "connectTimeout": 60000,
      "readTimeout": 60000
    }
  }
} |
Approach(s)
1. JSON would be the default message format.
2. If batchSize > 1, delimiterForMessages would be used to join the messages into a single batch payload.
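The batching approach can be sketched as below. `batch_payloads` is a hypothetical helper; it assumes messages are already formatted as strings and that any leftover messages smaller than a full batch are flushed at the end, which the approach above does not state explicitly.

```python
from typing import Iterable, Iterator, List


def batch_payloads(messages: Iterable[str], batch_size: int,
                   delimiter: str = "\n") -> Iterator[str]:
    """Group formatted messages into payloads of up to `batch_size` messages."""
    if batch_size < 1:
        raise ValueError("batch size must be greater than 0")
    pending: List[str] = []
    for message in messages:
        pending.append(message)
        if len(pending) == batch_size:
            yield delimiter.join(pending)
            pending = []
    if pending:
        # Assumption: a final partial batch is still sent rather than dropped.
        yield delimiter.join(pending)
```

With a batch size of 1, each message becomes its own payload, which matches the "no batching" default.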
Properties
NFR
1. If the user enables SSL validation, they are expected to add the certificate to the truststore of each machine.
Limitation(s)
Future Work
- Some future work – HYDRATOR-99999
- Another future work – HYDRATOR-99999
Test Case(s)
Sample Pipeline
Checklist
- User stories documented
- User stories reviewed
- Design documented
- Design reviewed
- Feature merged
- Examples and guides
- Integration tests
- Documentation for feature
- Short video demonstrating the feature