Macro Substitution
Previously, the configuration of plugins in an ETL pipeline could not be changed after the pipeline was deployed. Macro substitution lets pipeline developers and operators set plugin properties on a run-to-run basis, which is useful for settings whose values are unknown at configuration time.
Specifying Substitutions
Macro substitution provides two types of substitution: property lookups and macro functions. Property lookups are specified as key-value pairs, which can be supplied in two ways:
- Set a key-value pair in the runtime arguments or preferences for the physical pipeline.
- Run a Custom Action or Hydrator Action in the first stage of the pipeline and have it set a key-value argument through a workflow token.
For a plugin property to be macro-substitutable, it must both be annotated as macro-enabled and be given valid macro syntax at configuration time.
Plugin Config Annotation
To enable macro substitution for a property, use the @Macro annotation on the property field in the plugin's configuration class. Macros are disabled by default for every field that is not annotated. For example:
    public class TableSinkConfig extends PluginConfig {
      @Name(Properties.Table.NAME)
      @Description("Name of the table. If the table does not already exist, one will be created.")
      // The name of the table can be specified by a runtime macro
      @Macro
      private String name;

      @Name(Properties.Table.PROPERTY_SCHEMA)
      @Description("schema of the table as a JSON Object. If the table does not already exist, one will be " +
        "created with this schema, which will allow the table to be explored through Hive. If no schema is given, the " +
        "table created will not be explorable.")
      @Nullable
      private String schemaStr;

      @Name(Properties.Table.PROPERTY_SCHEMA_ROW_FIELD)
      @Description("The name of the record field that should be used as the row key when writing to the table.")
      private String rowField;
    }
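To illustrate the opt-in behavior of the annotation, the following is a minimal, self-contained sketch that uses reflection to list which fields of a configuration class are macro-enabled. The Macro annotation and the DemoSinkConfig class declared here are hypothetical stand-ins so that the example compiles on its own; they are not the platform's classes, and this is not necessarily how the platform inspects plugin configurations.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class MacroAnnotationSketch {

  // Hypothetical stand-in for the real @Macro annotation, declared here only
  // so that the sketch compiles without any plugin API on the classpath.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.FIELD)
  @interface Macro { }

  // Simplified, hypothetical stand-in for a plugin configuration class.
  static class DemoSinkConfig {
    @Macro
    private String name;       // annotated, so macros are enabled

    private String schemaStr;  // not annotated, so macros stay disabled
    private String rowField;   // not annotated, so macros stay disabled
  }

  // Returns the names of the declared fields that carry the @Macro annotation.
  static List<String> macroEnabledFields(Class<?> configClass) {
    List<String> enabled = new ArrayList<>();
    for (Field field : configClass.getDeclaredFields()) {
      if (field.isAnnotationPresent(Macro.class)) {
        enabled.add(field.getName());
      }
    }
    return enabled;
  }

  public static void main(String[] args) {
    // Only the annotated field is reported.
    System.out.println(macroEnabledFields(DemoSinkConfig.class)); // prints [name]
  }
}
```

The annotation needs runtime retention for the reflective check to see it; with the default class-file retention, isAnnotationPresent would return false.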
Syntax
In addition to annotating a plugin property with @Macro, valid macro syntax must be supplied for the property at configuration time. There are two valid macro syntaxes: property lookups and macro functions.
Property Lookup
Macro property lookups are simple key-value substitutions that use the following syntax:
${macro-name}
At runtime, the syntax ${macro-name} is replaced with whatever value was specified for the key "macro-name". For instance, you might not know the name of a source stream until runtime. In the source's Stream Name configuration, you could use:
${source-stream-name}
and in the runtime arguments (or preferences) set a key-value pair such as:
source-stream-name: myDemoStream
Macros can be referential: the value of one macro can itself contain other macros. For example, a server address might be composed of a hostname and a port, specified with this substitution:
server-address: ${hostname}:${port}
and these runtime arguments:
hostname: my-demo-host.example.com
port: 9991
In a pipeline configuration, you could configure a property with:
${server-address}
expecting that it would be replaced with:
my-demo-host.example.com:9991
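The referential lookup above can be sketched as a small recursive resolver. This is an illustrative sketch only, not the platform's implementation: the class name, the regular expression, the depth limit, and the decision to throw on a missing key are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PropertyLookupSketch {

  // Matches ${key}, where key contains no '$', braces, or parentheses.
  private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^$\\{\\}()]+)\\}");

  // Assumed guard against macros that refer to each other in a cycle.
  private static final int MAX_DEPTH = 10;

  static String substitute(String text, Map<String, String> arguments) {
    return substitute(text, arguments, 0);
  }

  private static String substitute(String text, Map<String, String> arguments, int depth) {
    if (depth > MAX_DEPTH) {
      throw new IllegalStateException("Macro nesting deeper than " + MAX_DEPTH + " levels");
    }
    Matcher matcher = LOOKUP.matcher(text);
    StringBuilder result = new StringBuilder();
    int last = 0;
    while (matcher.find()) {
      String key = matcher.group(1);
      String value = arguments.get(key);
      if (value == null) {
        throw new IllegalArgumentException("No value specified for macro: " + key);
      }
      // Copy the literal text before the macro, then the resolved value.
      result.append(text, last, matcher.start());
      // The value may itself contain macros, so resolve it recursively.
      result.append(substitute(value, arguments, depth + 1));
      last = matcher.end();
    }
    result.append(text, last, text.length());
    return result.toString();
  }

  public static void main(String[] args) {
    Map<String, String> arguments = new HashMap<>();
    arguments.put("server-address", "${hostname}:${port}");
    arguments.put("hostname", "my-demo-host.example.com");
    arguments.put("port", "9991");

    // prints my-demo-host.example.com:9991
    System.out.println(substitute("${server-address}", arguments));
  }
}
```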
Macro Function
Macro functions allow more complex logic to be run before a substitution occurs and use the following syntax:
${macroFunction(arg1,arg2,arg3)}
At runtime, the "macroFunction" function will perform some computation with the provided arguments: arg1, arg2, and arg3. Note that whitespace is significant between arguments. The syntax will be replaced with whatever macroFunction evaluates to given the provided arguments.
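As a rough illustration of how such a call might be parsed and dispatched, the sketch below looks up the function name in a registry and passes the arguments through untrimmed, which is why whitespace between arguments matters. Everything here is an assumption for the sake of the example: the class, the regular expression, and the concat function are hypothetical and are not part of the platform.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MacroFunctionSketch {

  // Matches ${name(args)}; args may be any text without a closing parenthesis.
  private static final Pattern CALL = Pattern.compile("\\$\\{(\\w+)\\(([^)]*)\\)\\}");

  static String evaluate(String text, Map<String, Function<String[], String>> functions) {
    Matcher matcher = CALL.matcher(text);
    StringBuilder result = new StringBuilder();
    int last = 0;
    while (matcher.find()) {
      String name = matcher.group(1);
      Function<String[], String> function = functions.get(name);
      if (function == null) {
        throw new IllegalArgumentException("Unknown macro function: " + name);
      }
      // Split on commas only. Arguments are deliberately not trimmed, so
      // "a, b" yields the arguments "a" and " b" (with a leading space).
      String[] arguments = matcher.group(2).split(",", -1);
      result.append(text, last, matcher.start());
      result.append(function.apply(arguments));
      last = matcher.end();
    }
    result.append(text, last, text.length());
    return result.toString();
  }

  public static void main(String[] args) {
    Map<String, Function<String[], String>> functions = new HashMap<>();
    // Hypothetical macro function that concatenates its arguments.
    functions.put("concat", parts -> String.join("", parts));

    System.out.println(evaluate("${concat(a,b)}", functions));  // prints ab
    System.out.println(evaluate("${concat(a, b)}", functions)); // prints a b
  }
}
```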
Currently, there are two supported macro functions: logicalStartTime and secureStore.
logicalStartTime
${logicalStartTime(timeFormat,offset)}
The logicalStartTime macro function takes in a time format and an optional offset as arguments and uses the logical start time of a pipeline to perform the substitution. For example, if a pipeline starts on January 1, 2016 at midnight and the following syntax is provided:
${logicalStartTime(yyyy-MM-dd'T'HH-mm-ss,1d-4h)}
then the macro would be substituted with the logical start time: 2016-01-01