Checklist

User Stories Documented
User Stories Reviewed
Design Reviewed
APIs reviewed
Release priorities assigned
Test cases reviewed
Blog post

Introduction

Briefly write the need for this feature

Goals

Clearly state the design goals/requirements for this feature

User Stories

Breakdown of User-Stories
User Story #1
User Story #2
User Story #3

Design

Consider the following pipeline for the design purpose:

File (Source) -> CSV Parser (Transform) -> Filter (Transform) -> Condition1 --TRUE--> Logistic Regression (Sink)
                                                                     |
                                                                   FALSE
                                                                     |
                                                                 Condition2 --TRUE--> Random Forest (Sink)
                                                                     |
                                                                   FALSE
                                                                     |
                                                                 Condition3 --TRUE--> Decision Tree (Sink)

In the above pipeline, the classification algorithm to execute is selected by the runtime argument 'input.algorithm'. In addition, the expensive model generation process should not run at all if the Filter transform did not produce enough records to proceed.

The pipeline is configured with 3 condition stages:

Condition1: output.filter Greater Than 1000 AND input.algorithm Equals 'Logistic Regression'
Condition2: output.filter Greater Than 1000 AND input.algorithm Equals 'Random Forest'
Condition3: output.filter Greater Than 1000 AND input.algorithm Equals 'Decision Tree'
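
The three conditions above form a chain in which at most one sink runs. The following is a minimal Java sketch of that selection logic, not the pipeline's actual API: the class name ConditionChain and the method selectSink are hypothetical, and the sketch assumes that 'output.filter' is the number of records emitted by the Filter stage.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from the pipeline API.
class ConditionChain {
    // Threshold used by all three conditions in the example pipeline.
    static final long MIN_RECORDS = 1000;

    // Algorithms in the order the conditions are chained: Condition1, Condition2, Condition3.
    static final List<String> ALGORITHMS =
        Arrays.asList("Logistic Regression", "Random Forest", "Decision Tree");

    // Returns the name of the sink to run, or null when every condition is false.
    // Assumption: filterOutputRecords stands for the value of 'output.filter'.
    static String selectSink(long filterOutputRecords, String algorithm) {
        for (String candidate : ALGORITHMS) {
            // ConditionN: output.filter Greater Than 1000 AND input.algorithm Equals candidate
            if (filterOutputRecords > MIN_RECORDS && candidate.equals(algorithm)) {
                return candidate;
            }
        }
        return null;
    }
}
```

For example, selectSink(1500, "Random Forest") returns "Random Forest", while selectSink(1000, "Random Forest") returns null because 1000 is not greater than 1000.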

Representation of the Condition in the Pipeline config

Since conditions are individual stages in the pipeline, they appear in both the connections and stages sections of the pipeline config JSON.

{  
   "connections":[  
      {  
         "from":"File",
         "to":"CSV Parser"
      },
	  ....	
      {  
         "from":"Condition1",
         "to":"Logistic Regression",
         "condition":true
      },
      {  
         "from":"Condition1",
         "to":"Condition2",
         "condition":false
      },
      {  
         "from":"Condition2",
         "to":"Random Forest",
         "condition":true
      },
      {  
         "from":"Condition2",
         "to":"Condition3",
         "condition":false
      },
      {  
         "from":"Condition3",
         "to":"Decision Tree",
         "condition":true
      }
   ],
   "stages":[  
	  { "name":"File", ...},
	  ...	
      {  
         "name":"Condition1",
         "plugin":{  
            "name":"Condition",
            "type":"condition",
            "label":"Condition1",
            "artifact":{  
               "name":"condition-plugins",
               "version":"1.7.0",
               "scope":"SYSTEM"
            },
            "properties":{  
               "conditions":{  
                  "cond1":{  
                     "subject":"output.filter",
                     "operator":"Greater Than",
                     "target":"1000"
                  },
                  "cond2":{  
                     "subject":"input.algorithm",
                     "operator":"Equals",
                     "target":"Logistic Regression"
                  },
                  "expressions":[  
                     {  
                        "operator":"AND",
                        "operands":[  
                           "cond1",
                           "cond2"
                        ]
                     }
                  ]
               }
            }
         }
      }
   ]
}
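
To make the "conditions" and "expressions" properties concrete, here is a rough Java sketch of how such a configuration might be evaluated. It is an assumption-laden illustration, not the proposed Java API: ConditionEvaluator, SimpleCondition, and evaluateAnd are hypothetical names, only the "Greater Than", "Equals", and "AND" operators from the example are handled, and a missing subject is assumed to make a condition false.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical evaluator; class and method names are illustrative, not an existing API.
class ConditionEvaluator {

    // One named condition (such as "cond1"): a subject, an operator, and a target.
    static class SimpleCondition {
        final String subject;
        final String operator;
        final String target;

        SimpleCondition(String subject, String operator, String target) {
            this.subject = subject;
            this.operator = operator;
            this.target = target;
        }

        // 'values' maps subjects such as "output.filter" to their runtime values.
        boolean evaluate(Map<String, String> values) {
            String actual = values.get(subject);
            if (actual == null) {
                return false; // assumption: a missing subject makes the condition false
            }
            switch (operator) {
                case "Greater Than":
                    return Long.parseLong(actual) > Long.parseLong(target);
                case "Equals":
                    return actual.equals(target);
                default:
                    throw new IllegalArgumentException("Unsupported operator: " + operator);
            }
        }
    }

    // Evaluates an "AND" expression: true only if every named operand is true.
    static boolean evaluateAnd(List<String> operands,
                               Map<String, SimpleCondition> conditions,
                               Map<String, String> values) {
        for (String name : operands) {
            if (!conditions.get(name).evaluate(values)) {
                return false;
            }
        }
        return true;
    }

    // Builds the two conditions of the Condition1 stage from the config above.
    static Map<String, SimpleCondition> condition1() {
        Map<String, SimpleCondition> conditions = new HashMap<>();
        conditions.put("cond1", new SimpleCondition("output.filter", "Greater Than", "1000"));
        conditions.put("cond2", new SimpleCondition("input.algorithm", "Equals", "Logistic Regression"));
        return conditions;
    }
}
```

Keeping the named conditions separate from the expression that combines them mirrors the config layout, where "operands" refer to conditions by name.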

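Because conditions are individual stages, they also appear in the connections section like any other stage. Each connection leaving a condition stage carries a boolean "condition" field that indicates which branch it represents.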
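
The way the "condition" flag on a connection selects a branch can be sketched as follows. This is only an illustration under stated assumptions: ConnectionRouter, Connection, and resolve are hypothetical names, the results of the condition stages are assumed to be already known, and the connections are assumed to contain no cycles.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical router; names are illustrative and not part of any existing API.
class ConnectionRouter {

    // Mirrors one entry of "connections": from, to, and the boolean "condition" flag.
    static class Connection {
        final String from;
        final String to;
        final boolean condition;

        Connection(String from, String to, boolean condition) {
            this.from = from;
            this.to = to;
            this.condition = condition;
        }
    }

    // The connections leaving the three condition stages in the example config.
    static List<Connection> exampleConnections() {
        return Arrays.asList(
            new Connection("Condition1", "Logistic Regression", true),
            new Connection("Condition1", "Condition2", false),
            new Connection("Condition2", "Random Forest", true),
            new Connection("Condition2", "Condition3", false),
            new Connection("Condition3", "Decision Tree", true));
    }

    // Follows branches from 'start' until a non-condition stage is reached.
    // 'results' maps each condition stage name to its evaluated result.
    // Returns null when a condition has no connection for its result.
    static String resolve(String start, List<Connection> connections, Map<String, Boolean> results) {
        String current = start;
        while (results.containsKey(current)) {
            String next = null;
            for (Connection c : connections) {
                if (c.from.equals(current) && c.condition == results.get(current)) {
                    next = c.to;
                    break;
                }
            }
            if (next == null) {
                return null; // for example, Condition3 evaluating to false
            }
            current = next;
        }
        return current;
    }
}
```

With Condition1 false and Condition2 true, resolve("Condition1", ...) follows the false branch to Condition2 and then the true branch to "Random Forest". If all three conditions are false, it returns null because Condition3 has no outgoing connection for false.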

Java API for conditions

Approach

Approach #1

Approach #2

API changes

New Programmatic APIs

New Java APIs introduced (both user facing and internal)

Deprecated Programmatic APIs

New REST APIs

Path	Method	Description	Response Code	Response
/v3/apps/<app-id>	GET	Returns the application spec for a given application	200 - On success; 404 - When application is not available; 500 - Any internal errors	

Deprecated REST API

Path	Method	Description
/v3/apps/<app-id>	GET	Returns the application spec for a given application

CLI Impact or Changes

Impact #1
Impact #2
Impact #3

UI Impact or Changes

Impact #1
Impact #2
Impact #3

Security Impact

What is the impact on authorization, and how does the design address it?

Impact on Infrastructure Outages

System behavior (if applicable, document the impact of failures in downstream components such as YARN and HBase) and how the design handles them

Test Scenarios

Test ID	Test Description	Expected Results

Releases

Release X.Y.Z

Release X.Y.Z

Related Work

Future work

Conditional Execution in Pipelines (WIP)
