Checklist
- User Stories Documented
- User Stories Reviewed
- Design Reviewed
- APIs reviewed
- Release priorities assigned
- Test cases reviewed
- Blog post
Introduction
Briefly write the need for this feature
Goals
Clearly state the design goals/requirements for this feature
User Stories
- Breakdown of User-Stories
- User Story #1
- User Story #2
- User Story #3
Design
We will introduce a new plugin type, "condition", in the pipeline.
Consider the following pipeline for design purposes:
                                                                                  TRUE
    File (Source) -> CSV Parser (Transform) -> Filter (Transform) -> Condition1 --------> Logistic Regression (Sink)
                                                                         |
                                                                         | FALSE
                                                                         |                   TRUE
                                                                         +----> Condition2 --------> Random Forest (Sink)
                                                                                    |
                                                                                    | FALSE
                                                                                    |                   TRUE
                                                                                    +----> Condition3 --------> Decision Tree (Sink)
In the above pipeline, we want to choose which classification algorithm to execute based on the runtime argument 'input.algorithm'. We also do not want to run the expensive model generation process if the Filter transform did not produce enough records to proceed.
The pipeline is configured with 3 condition stages:
- Condition1: output.filter Greater Than 1000 AND input.algorithm Equals 'Logistic Regression'
- Condition2: output.filter Greater Than 1000 AND input.algorithm Equals 'Random Forest'
- Condition3: output.filter Greater Than 1000 AND input.algorithm Equals 'Decision Tree'
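To make the routing concrete, here is a minimal sketch of how the three conditions above select a sink. It is an illustration only, not the proposed implementation: the class `ConditionRoutingSketch` and the method `selectSink` are hypothetical names, and the sketch assumes that 'output.filter' is the number of records emitted by the Filter stage.

```java
/** Hypothetical helper, not part of the proposed API. */
class ConditionRoutingSketch {

  // Algorithms in the order the condition stages are chained:
  // Condition1, then Condition2, then Condition3.
  private static final String[] ALGORITHMS = {
    "Logistic Regression", "Random Forest", "Decision Tree"
  };

  /**
   * Returns the name of the sink that would run, or null when every
   * condition evaluates to false and no model is generated.
   */
  static String selectSink(long filterOutputRecords, String algorithm) {
    // Guard shared by all three conditions: output.filter Greater Than 1000.
    boolean enoughRecords = filterOutputRecords > 1000;
    for (String candidate : ALGORITHMS) {
      // A true result takes the TRUE branch to the sink; a false result
      // falls through to the next condition stage.
      if (enoughRecords && candidate.equals(algorithm)) {
        return candidate;
      }
    }
    return null;
  }

  public static void main(String[] args) {
    System.out.println(selectSink(5000, "Random Forest")); // prints "Random Forest"
    System.out.println(selectSink(500, "Random Forest"));  // prints "null"
  }
}
```

Because every condition repeats the record-count guard, a run with 1000 or fewer filtered records takes the FALSE branch at all three stages and reaches no sink.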
Representation of the Condition in the Pipeline config
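Since conditions are individual stages in the pipeline, they appear in both the connections and stages sections of the pipeline config JSON. Each connection that leaves a condition stage carries a boolean "condition" field that marks it as the true or the false branch.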
    {
      "connections":[
        { "from":"File", "to":"CSV Parser" },
        ....
        { "from":"Condition1", "to":"Logistic Regression", "condition":true },
        { "from":"Condition1", "to":"Condition2", "condition":false },
        { "from":"Condition2", "to":"Random Forest", "condition":true },
        { "from":"Condition2", "to":"Condition3", "condition":false },
        { "from":"Condition3", "to":"Decision Tree", "condition":true }
      ],
      "stages":[
        { "name":"File", ...},
        ...
        {
          "name":"Condition1",
          "plugin":{
            "name":"Condition",
            "type":"condition",
            "label":"Condition1",
            "artifact":{
              "name":"condition-plugins",
              "version":"1.7.0",
              "scope":"SYSTEM"
            },
            "properties":{
              "conditions":{
                "cond1":{
                  "subject":"output.filter",
                  "operator":"Greater Than",
                  "target":"1000"
                },
                "cond2":{
                  "subject":"input.algorithm",
                  "operator":"Equals",
                  "target":"Logistic Regression"
                },
                "expressions":[
                  {
                    "operator":"AND",
                    "operands":[ "cond1", "cond2" ]
                  }
                ]
              }
            }
          }
        }
      ]
    }
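The "conditions" property in the config above could be evaluated along the following lines. This is a sketch under stated assumptions, not the plugin's actual logic: `ConditionConfigSketch`, `SimpleCondition`, and `evaluateAnd` are hypothetical names, the subject values are assumed to be available as a flat map of strings, a missing subject is assumed to make a condition false, and only the two operators and the AND expression shown in the config are handled.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical evaluator for the "conditions" property shown above. */
class ConditionConfigSketch {

  /** One named entry such as cond1: subject, operator, and target as strings. */
  static final class SimpleCondition {
    final String subject;
    final String operator;
    final String target;

    SimpleCondition(String subject, String operator, String target) {
      this.subject = subject;
      this.operator = operator;
      this.target = target;
    }

    boolean evaluate(Map<String, String> values) {
      String actual = values.get(subject);
      if (actual == null) {
        return false; // Assumption: a missing subject makes the condition false.
      }
      switch (operator) {
        case "Greater Than":
          return Long.parseLong(actual) > Long.parseLong(target);
        case "Equals":
          return actual.equals(target);
        default:
          throw new IllegalArgumentException("Unsupported operator: " + operator);
      }
    }
  }

  /** Evaluates an AND expression over the named operands. */
  static boolean evaluateAnd(Map<String, SimpleCondition> conditions,
                             List<String> operands,
                             Map<String, String> values) {
    for (String name : operands) {
      SimpleCondition condition = conditions.get(name);
      if (condition == null) {
        throw new IllegalArgumentException("Unknown condition: " + name);
      }
      if (!condition.evaluate(values)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, SimpleCondition> conditions = new HashMap<>();
    conditions.put("cond1", new SimpleCondition("output.filter", "Greater Than", "1000"));
    conditions.put("cond2", new SimpleCondition("input.algorithm", "Equals", "Logistic Regression"));

    Map<String, String> values = new HashMap<>();
    values.put("output.filter", "2500");
    values.put("input.algorithm", "Logistic Regression");

    System.out.println(evaluateAnd(conditions, Arrays.asList("cond1", "cond2"), values)); // prints "true"
  }
}
```

Keeping the named conditions separate from the expression that combines them mirrors the config layout, where "expressions" refers to "cond1" and "cond2" by name.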
Java API for conditions
The Condition plugin interface will be as follows:
    /**
     * Represents condition to be executed in pipeline.
     */
    public abstract class Condition {
      public static final String PLUGIN_TYPE = "condition";

      /**
       * Implement this method to execute the code as a part of execution of the condition.
       * If this method returns {@code true}, true branch will get executed, otherwise false
       * branch will get executed.
       * @param context the condition context, containing information about the pipeline run
       * @throws Exception when there is failure in method execution
       */
      public abstract boolean apply(ConditionContext context) throws Exception;
    }

    /**
     * Represents the context available to the condition plugin during runtime.
     */
    public interface ConditionContext extends StageContext, Transactional, SecureStore, SecureStoreManager {

      /**
       * Return the arguments which can be updated.
       */
      SettableArguments getArguments();
    }
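As an illustration of how a plugin might implement `apply`, the sketch below checks the 'input.algorithm' argument and records its decision in the updatable arguments. Everything here is an assumption made so the example compiles on its own: the nested `Arguments`, `ConditionContext`, and `Condition` types are trimmed stand-ins for the proposed API, and `AlgorithmEquals`, `contextOf`, and the argument name 'condition.matched' are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only; the nested types are stand-ins so the example runs without CDAP. */
class AlgorithmConditionSketch {

  /** Stand-in for SettableArguments; only get and set are modelled here. */
  interface Arguments {
    String get(String name);
    void set(String name, String value);
  }

  /** Stand-in for ConditionContext, reduced to getArguments(). */
  interface ConditionContext {
    Arguments getArguments();
  }

  /** Stand-in for the proposed abstract Condition class. */
  abstract static class Condition {
    public abstract boolean apply(ConditionContext context) throws Exception;
  }

  /** Hypothetical plugin: true when input.algorithm equals the expected name. */
  static final class AlgorithmEquals extends Condition {
    private final String expected;

    AlgorithmEquals(String expected) {
      this.expected = expected;
    }

    @Override
    public boolean apply(ConditionContext context) {
      boolean matched = expected.equals(context.getArguments().get("input.algorithm"));
      // Hypothetical argument name: record the decision for later stages.
      context.getArguments().set("condition.matched", String.valueOf(matched));
      return matched;
    }
  }

  /** Builds a map-backed context to exercise a condition outside a pipeline. */
  static ConditionContext contextOf(Map<String, String> initial) {
    Map<String, String> store = new HashMap<>(initial);
    Arguments arguments = new Arguments() {
      @Override
      public String get(String name) {
        return store.get(name);
      }

      @Override
      public void set(String name, String value) {
        store.put(name, value);
      }
    };
    return () -> arguments;
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> runtimeArgs = new HashMap<>();
    runtimeArgs.put("input.algorithm", "Random Forest");
    ConditionContext context = contextOf(runtimeArgs);

    System.out.println(new AlgorithmEquals("Random Forest").apply(context)); // prints "true"
    System.out.println(context.getArguments().get("condition.matched"));     // prints "true"
  }
}
```

Returning a boolean keeps branch selection in the pipeline engine, while the updatable arguments give the condition a way to pass information to later stages.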
Approach
Approach #1
Approach #2
API changes
New Programmatic APIs
New Java APIs introduced (both user facing and internal)
Deprecated Programmatic APIs
New REST APIs
Path | Method | Description | Response Code | Response |
---|---|---|---|---|
/v3/apps/<app-id> | GET | Returns the application spec for a given application | 200 - On success; 404 - When application is not available; 500 - Any internal errors | |
Deprecated REST API
Path | Method | Description |
---|---|---|
/v3/apps/<app-id> | GET | Returns the application spec for a given application |
CLI Impact or Changes
- Impact #1
- Impact #2
- Impact #3
UI Impact or Changes
- Impact #1
- Impact #2
- Impact #3
Security Impact
What's the impact on Authorization, and how does the design take care of this aspect?
Impact on Infrastructure Outages
System behavior (if applicable, document the impact of failures in downstream components [YARN, HBase, etc.]) and how the design takes care of these aspects
Test Scenarios
Test ID | Test Description | Expected Results |
---|---|---|
Releases
Release X.Y.Z
Release X.Y.Z
Related Work
- Work #1
- Work #2
- Work #3