Introduction

A rules engine transform applies a predefined set of rules to incoming data (real-time as well as batch). Rules must be generic enough to allow updating a dataset, posting to an HTTP endpoint, sending an email, etc., based on the field values in each record.

Use case(s)

CompanyA wants to develop a streaming pipeline to read and process signals from wearable and non-wearable devices, and apply rules to the incoming signals. Based on the rules, it wants to send notifications to configured mobile devices to provide concierge and/or healthcare services. A rules engine would be leveraged to determine which notifications to send based on the incoming signal.
CompanyA has a rules management system that allows each user to feed in a unique set of rules for each device. Rules can state actions to be taken if certain conditions are met in the signals from the provided devices. These rules are stored in a CDAP dataset and can be retrieved and edited by the user at any time. A streaming pipeline reads the rules dataset and applies the rules applicable to incoming signals to trigger the appropriate notifications. A long-running streaming pipeline would refresh the rules automatically and begin applying them to incoming signals as needed.
CompanyA would like selection criteria for applying rules to incoming signals based on 3 different fields in the message. If no rules are found for field 1, the pipeline would need to look up rules based on the second field, and then the third field. Selection criteria could be based on device hierarchy, but could also be arbitrary. The company would like to configure the rule selection criteria in the plugin to account for this hierarchy.

User Stories

As a developer, I would like a way to define input field level transformations based on criteria such as field1 > 10 && field2 == "phone", where the output of the rule is an update to an existing field or the creation of a new field in the output record. (rule)
As a developer, I would like a way for users to add, remove, and edit the rules that should be applied to data in the system. (rule management)
As a developer, I would like to group those rules together using a common set of lookup criteria, such as field1=2343243,field2=device-name,field3=phone. (ruleset)
As a Hydrator user, I would like to apply rulesets to incoming messages based on fields from the incoming data. (ruleset criteria)
As a Hydrator user, I would like to be able to look up rules from a rules repository. (rule management)
As a Hydrator user, I would like to apply only those rules that are applicable to the incoming record. To achieve this, I would like to specify selection criteria for rules. The selection criteria may be based on the input schema. (ruleset criteria)

Plugin Type

Transform

Configurable Parameters

This section defines properties that are configurable for this plugin.

User Facing Name	Type	Description	Constraints
Input Schema	String	Schema of the input data	Specified as a CDAP schema
Ruleset Repository	String	Name of the dataset to be used as the repository of rules	
Ruleset Selection Criteria	String	Logic for selecting rules from the rules dataset, expressed in a grammar that is not yet defined	
Output Schema	String	Schema of the output data, including the fields that hold rule outputs	Specified as a CDAP schema

Design

Approach(s)

Rules can be written in a format that complies with JEXL (http://commons.apache.org/proper/commons-jexl/).
Rules in a rule set should be added in the order of their priority.
Field names used in a rule should be populated automatically by parsing the rule.
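As an illustration of populating field names by parsing a rule, here is a minimal sketch in Python. It is not the plugin's implementation: it assumes a JEXL-like syntax, uses regular expressions rather than a real JEXL parser, and the function name extract_fields and the keyword list are hypothetical. Identifiers on the left of a single '=' are treated as output fields; every other non-keyword identifier is treated as an input field.

```python
import re

# Assumption: an illustrative, non-exhaustive list of JEXL-like reserved words.
KEYWORDS = {"if", "else", "for", "while", "var", "return",
            "true", "false", "null", "and", "or", "not", "empty", "size"}

STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# An identifier followed by a single '=' (not '==') is an assignment target.
ASSIGNMENT = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*=(?!=)")


def extract_fields(rule):
    """Return (input_fields, output_fields) referenced by a rule string.

    A field that is both read and assigned is reported as an output only.
    """
    # Drop quoted literals so their contents are not mistaken for field names.
    text = STRING_LITERAL.sub(" ", rule)

    outputs = []
    for name in ASSIGNMENT.findall(text):
        if name not in outputs:
            outputs.append(name)

    inputs = []
    for name in IDENTIFIER.findall(text):
        if name in KEYWORDS or name in outputs or name in inputs:
            continue
        inputs.append(name)
    return inputs, outputs


if __name__ == "__main__":
    rule = "if (Age > 18) { Age_group = 'Adult'; } else { Age_group = 'Child'; }"
    print(extract_fields(rule))
```

A production version would more likely walk the syntax tree produced by the rule language's own parser, which also handles comments, nested expressions, and function calls that this sketch ignores.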

Design

The output fields will be populated based on the output of the application of rules on one or many fields in the input schema.

Rules DB

Id	Rule	Description	Fields Used in the Rule	Output Fields/Types
1	if (Age > 18) { Age_group = 'Adult'; } else { Age_group = 'Child'; }	Category rule to classify between age groups	Age	Age_group:string
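To make the Rules DB entry concrete, the sketch below applies rules to a record in priority order. It is written in Python with a plain function standing in for the JEXL rule, so it only illustrates the data flow. The Rule class, apply_rules, and the conflict policy (an earlier, higher-priority rule wins when two rules write the same output field) are assumptions, not part of the design above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    """One row of the Rules DB (hypothetical in-memory representation)."""
    rule_id: int
    description: str
    input_fields: List[str]
    output_fields: Dict[str, str]  # output field name -> type
    apply: Callable[[Dict[str, Any]], Dict[str, Any]]


def age_group_rule(record):
    # Python stand-in for the rule with Id 1 in the table above.
    return {"Age_group": "Adult" if record["Age"] > 18 else "Child"}


RULES = [
    Rule(1, "Category rule to classify between age groups",
         ["Age"], {"Age_group": "string"}, age_group_rule),
]


def apply_rules(record, rules):
    """Apply rules in list order; rules earlier in the list have higher priority."""
    output = dict(record)
    written = set()
    for rule in rules:
        # Skip rules whose input fields are missing from the record.
        if any(field not in record for field in rule.input_fields):
            continue
        for name, value in rule.apply(record).items():
            # Assumed policy: the first rule to write an output field wins.
            if name not in written:
                output[name] = value
                written.add(name)
    return output


if __name__ == "__main__":
    print(apply_rules({"Age": 25}, RULES))
    print(apply_rules({"Age": 10}, RULES))
```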

Rule set DB

Rule Set Id	List of Rules	Description	Criteria	Fields Used in Criteria
34	1, 2	Rule set to classify target customer groups	f1=12,f2=company	f1,f2
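The criteria column can be read as a comma-separated list of field=value pairs. The sketch below, in Python, shows one way such a string might be parsed and matched against a record. The function names and the choice to compare values as strings are assumptions; the actual criteria grammar is not specified in this document.

```python
def parse_criteria(criteria):
    """Parse a string such as 'f1=12,f2=company' into {'f1': '12', 'f2': 'company'}."""
    parsed = {}
    for pair in criteria.split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        field, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"malformed criterion: {pair!r}")
        parsed[field.strip()] = value.strip()
    return parsed


def matches(record, criteria):
    """True when every criterion field is present and equal (compared as strings)."""
    return all(
        field in record and str(record[field]) == value
        for field, value in criteria.items()
    )


if __name__ == "__main__":
    # Mirrors rule set 34 from the table above.
    rule_set = {"id": 34, "rules": [1, 2], "criteria": "f1=12,f2=company"}
    criteria = parse_criteria(rule_set["criteria"])
    print(matches({"f1": 12, "f2": "company", "f3": "other"}, criteria))
    print(matches({"f1": 13, "f2": "company"}, criteria))
```

The "Fields Used in Criteria" column then corresponds to the keys of the parsed dictionary.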

Approach(s)

A rule-set DB will be set up in Tracker, where a user can add rules along with the conditions of the rule set they belong to.
If the user does not select any rule set, the output will be a one-to-one mapping of the input schema to the output schema.
The user will be able to reference the rule-set DB from within the plugin to assign rule sets.
Rule sets can be applied hierarchically: if no rule set exists for Field A, the plugin looks up rules for Field B.
Post actions can be taken based on the values in the output.
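The hierarchical lookup described above can be sketched as follows in Python. All names here (find_rule_set, the device_id, device_type, and vendor fields, and the index keyed by field name and value) are hypothetical. The non-hierarchical behavior, where only the first lookup field is consulted, is also an assumption.

```python
def find_rule_set(record, lookup_fields, rule_sets_by_key, hierarchical=True):
    """Return the rule ids of the first rule set found, or None.

    lookup_fields is ordered from most to least specific. rule_sets_by_key
    maps (field_name, field_value) to a list of rule ids.
    """
    # Assumption: without the hierarchy, only the first lookup field is tried.
    fields = lookup_fields if hierarchical else lookup_fields[:1]
    for field in fields:
        if field not in record:
            continue
        rule_ids = rule_sets_by_key.get((field, record[field]))
        if rule_ids:
            return rule_ids
    # No rule set found: the caller passes the record through unchanged.
    return None


if __name__ == "__main__":
    index = {
        ("device_id", "d-1"): [1],
        ("device_type", "watch"): [2, 3],
    }
    fields = ["device_id", "device_type", "vendor"]
    # No rule set exists for this device_id, so the lookup falls back to device_type.
    print(find_rule_set({"device_id": "d-9", "device_type": "watch"}, fields, index))
```

A result of None corresponds to the case where no rule set applies and the output is a one-to-one mapping of the input.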

Properties

Rule-set DB: Name of the dataset to be used as the rules repository. If left blank, the default Tracker rules dataset will be used.
LookUp fields: Map of output fields to the corresponding list of fields based on which the rule sets are to be applied. Multiple rules can be applied to a single field.
Hierarchical: Boolean (checkbox) specifying whether the rule set lookup should be hierarchical.

Security

Limitation(s)

Point lookups are not currently supported by Spark Streaming.

Future Work

Test Case(s)

User is able to apply multiple rule sets.
Output data is filtered according to the rules.
Not configuring any rule results in a one-to-one mapping of input to output.

Checklist

User stories documented
User stories reviewed
Design documented
Design reviewed
Feature merged
Examples and guides
Integration tests
Documentation for feature
Short video demonstrating the feature
