Value Mapper

Introduction 

Value Mapper is a type of transform that maps string values of a field in the input record to another value. Mappings are usually stored in another dataset. This provides you a simple alternative for mapping data in the record.

Use-case(s)

If you want to replace language codes in the input record field to language description.

  • Source Field name : language_code
  • Target field name : language_desc
  • Mappings, Source / Target :  DE/German, ES/Spanish, EN/English, etc.

Conditions

  • If source field to be mapped is null, then target field will be null

  • If source field to be mapped is empty, then target field will be empty

  • If source field cannot be mapped, then a default could be assigned and that’s populated in the target field

  • More than one field can be mapped using this transform

  • Source field type can be of only string type

  • Output field type can be of only string type

Options

Following are the options that should be provided for user to configure

  • Configuring the value mapper fields

    • Mapping

      • Source Field Name

      • Dataset

      • Target Field Name

    • Can support one or more such mappings

  • How the source field should be handled

    • If NULL, then user can provide a default value or NULL

    • If EMPTY, then user can provide a default value or EMPTY

  • Output Schema should allow

    • Remove Source Field

    • Include Target Field

Design

 

Implementation strategy to allow user to provide default value for mapping field (default value or empty or null)

Plugin would expect the input with following UI Widget:

 

This will be represented in json format in plugin as: "defaults":"field1:value1, field2:value2" 

Examples

Suppose that user takes the input data( employee details) through the csv file or any other source and  wants to apply value mapper on certain field (field=Designation).

This would be helpful for user to access data in terms of readability.


Source: We are considering the source as CSV file, For Example:

Source

 

Output from Source

Type

Value

S3

 

Id

String

1234

Path: Path on S3

 

Name

String

John

CSV File with fields:

 

Salary

INT

9000

id, name,salary,designation

 

Designation

String

2

 

ValueMapper Plugin :  For this input will be the StructuredRecord from source and Mapping/Lookup Dataset.

The transform function in this plugin will apply the mapping on the source fields using Lookup Interface.

StructuredRecord Format:

Id

String

Name

String

Salary

INT

Designation

String


Sample structure for Mapping/Lookup Dataset

Designation

Value

1

SE

2

SSE

3

ML

 

Sink : After the transformations from ValueMapper plugin, output will have below structure:

FieldName

Type

Value

Id

String

1234

Name

String

John

Salary

INT

9000

Designation

String

SSE


Properties:

  • source: name of the field that contains the input column name to be mapped. This field should be removed from the output schema.

  • target: name of the field that contains the output column name that will be mapped. This field should be included in output schema.
  • lookup table: The configuration of the lookup tables to be used in your transform as mapping dataset. For example, if lookup table "designation" is configured, then you will be able to perform operations with that lookup table in your transform. Currently supports KeyValueTable.

  • User would provide these properties in input text field as triplet (as shown below)
  • User would provide default values for mapping fields. Json representation is provided in below example

Example:
         {
             "name": "ValueMapper",
             "type": "transform",
             "properties": {
                     "mapping": "source1:LookupTableName:target1,source2:LookupTable2:target2",

                      "defaults":"field1:value1,field2:value2"                   

             }
        }

 

The transform takes record that have "id,name,salary,designationID" fields, maps the value of designationID field to generate designationName field based on the mapping dataset and then returns a record containing "id,name,salary,designationName" fields.

For example, if the input is like:

field nametype value 
id string US1
name string samuel
salary string 1000
designationID string D3

it will transform it to this output record:

field nametype value 
id string US1
name string samuel
salary string 1000
designationName string Module Lead

Table of Contents

Checklist

  • User stories documented 
  • User stories reviewed 
  • Design documented 
  • Design reviewed 
  • Feature merged 
  • Examples and guides 
  • Integration tests 
  • Documentation for feature 
  • Short video demonstrating the feature