Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
Briefly write the need for this feature
Goals
Clearly state the design goals/requirements for this feature
User Stories
- Breakdown of User-Stories
- User Story #1
- User Story #2
- User Story #3
Design
Consider the following pipeline for the design purpose:
Code Block |
---|
                                                                             TRUE
File (Source) -> CSV Parser (Transform) -> Filter (Transform) -> Condition1--------> Logistic Regression (Sink)
                                                                     |
                                                                     | FALSE
                                                                     |                   TRUE
                                                                     |-----> Condition2-------> Random Forest (Sink)
                                                                                 |
                                                                                 | FALSE
                                                                                 |                   TRUE
                                                                                 |-----> Condition3--------> Decision Tree (Sink)
|
In the above pipeline, we want to execute the classification algorithm based on the runtime argument 'input.algorithm'. We also do not want to run the expensive model generation process if the Filter transform did not produce enough records to proceed further.
The pipeline is configured with 3 condition nodes:
- Condition1: output.filter Greater Than 1000 AND input.algorithm Equals 'Logistic Regression'
- Condition2: output.filter Greater Than 1000 AND input.algorithm Equals 'Random Forest'
- Condition3: output.filter Greater Than 1000 AND input.algorithm Equals 'Decision Tree'
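The routing implied by the three condition nodes can be sketched as follows. This is only an illustration, not the proposed implementation: the names `MIN_RECORDS`, `CONDITION_CHAIN`, and `select_sink` are hypothetical, and returning `None` when Condition3 is FALSE is an assumption, since the behavior of that branch is not specified here.

```python
from typing import Optional

# Threshold from the condition list above: output.filter Greater Than 1000.
MIN_RECORDS = 1000

# Ordered chain Condition1 -> Condition2 -> Condition3, each paired with the
# algorithm whose sink runs when that condition evaluates to TRUE.
CONDITION_CHAIN = [
    ("Condition1", "Logistic Regression"),
    ("Condition2", "Random Forest"),
    ("Condition3", "Decision Tree"),
]


def select_sink(filter_output_count: int, algorithm: str) -> Optional[str]:
    """Return the sink to run, or None if every condition is FALSE (assumed)."""
    for _node, sink_algorithm in CONDITION_CHAIN:
        enough_records = filter_output_count > MIN_RECORDS  # strict comparison
        if enough_records and algorithm == sink_algorithm:
            return sink_algorithm  # TRUE branch: run this sink
        # FALSE branch: fall through to the next condition node
    return None
```

For example, `select_sink(1500, "Random Forest")` walks past Condition1 on its FALSE branch and selects the Random Forest sink at Condition2, while a record count of exactly 1000 selects nothing because the comparison is strictly "Greater Than".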
Representation of the Condition in the Pipeline config
The following is one possible representation of the condition stage in the pipeline config JSON. Because conditions are individual stages, they also appear in the connections section, like other stages.
Code Block |
---|
{
  "name":"Condition1",
  "plugin":{
    "name":"Condition",
    "type":"condition",
    "label":"Condition1",
    "artifact":{
      "name":"condition-plugins",
      "version":"1.7.0",
      "scope":"SYSTEM"
    },
    "properties":{
      "conditions":{
        "cond1":{
          "subject":"output.filter",
          "operator":"Greater Than",
          "target":"1000"
        },
        "cond2":{
          "subject":"input.algorithm",
          "operator":"Equals",
          "target":"Logistic Regression"
        },
        "expressions":[
          {
            "operator":"AND",
            "operand1":"cond1",
            "operand2":"cond2"
          }
        ],
        "connectors":{
          "TRUE":"LogisticRegressionStage",
          "FALSE":"Condition2"
        }
      }
    }
  }
}
|
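To make the semantics of this representation concrete, here is a minimal sketch of how the `properties` block of Condition1 might be evaluated to pick the next stage. Everything outside the JSON itself is an assumption for illustration: the helper names `OPERATORS`, `evaluate_condition`, and `next_stage` are hypothetical, `output.filter` is assumed to be the number of records emitted by the Filter stage, the runtime values are assumed to arrive as a plain dictionary, and only a single `AND` expression is handled.

```python
import json

# The "properties" block of Condition1, nested exactly as in the config above.
PROPERTIES = json.loads("""
{
  "conditions": {
    "cond1": {"subject": "output.filter", "operator": "Greater Than", "target": "1000"},
    "cond2": {"subject": "input.algorithm", "operator": "Equals", "target": "Logistic Regression"},
    "expressions": [{"operator": "AND", "operand1": "cond1", "operand2": "cond2"}],
    "connectors": {"TRUE": "LogisticRegressionStage", "FALSE": "Condition2"}
  }
}
""")

# Hypothetical mapping from operator names in the config to comparisons.
OPERATORS = {
    "Greater Than": lambda actual, target: float(actual) > float(target),
    "Equals": lambda actual, target: str(actual) == target,
}


def evaluate_condition(cond, values):
    """Evaluate one subject/operator/target triple against runtime values."""
    actual = values[cond["subject"]]
    return OPERATORS[cond["operator"]](actual, cond["target"])


def next_stage(properties, values):
    """Return the stage named by the TRUE or FALSE connector."""
    block = properties["conditions"]
    # Every key other than "expressions" and "connectors" is a named condition.
    results = {
        name: evaluate_condition(cond, values)
        for name, cond in block.items()
        if name not in ("expressions", "connectors")
    }
    expr = block["expressions"][0]  # sketch: only the first expression is used
    if expr["operator"] != "AND":
        raise ValueError("unsupported operator in this sketch: " + expr["operator"])
    outcome = results[expr["operand1"]] and results[expr["operand2"]]
    return block["connectors"]["TRUE" if outcome else "FALSE"]
```

With `{"output.filter": 1500, "input.algorithm": "Logistic Regression"}` the sketch follows the TRUE connector to `LogisticRegressionStage`; any other algorithm, or a record count of 1000 or less, follows the FALSE connector to `Condition2`.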
Approach
Approach #1
Approach #2
API changes
New Programmatic APIs
New Java APIs introduced (both user facing and internal)
Deprecated Programmatic APIs
New REST APIs
Path | Method | Description | Response Code | Response |
---|---|---|---|---|
/v3/apps/<app-id> | GET | Returns the application spec for a given application | 200 - On success; 404 - When application is not available; 500 - Any internal errors | |
Deprecated REST API
Path | Method | Description |
---|---|---|
/v3/apps/<app-id> | GET | Returns the application spec for a given application |
CLI Impact or Changes
- Impact #1
- Impact #2
- Impact #3
UI Impact or Changes
- Impact #1
- Impact #2
- Impact #3
Security Impact
What is the impact on authorization, and how does the design address it?
Impact on Infrastructure Outages
System behavior (if applicable, document the impact of downstream component failures [YARN, HBase, etc.]) and how the design handles these aspects
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
Releases
Release X.Y.Z
Release X.Y.Z
Related Work
- Work #1
- Work #2
- Work #3