Fully Parameterized Merge Queries on BQ

This article is posted on the CDAP Doc wiki and will be maintained here:

Users want to create dynamic pipelines for greater reusability and ease of operations. This guide walks users through how to create a dynamic pipeline to use for Merge use cases. For context on dynamic pipelines, please refer to this article .

Background

Data Fusion provides the BQ Execute action plugin that can be used to execute Merge queries. To create fully dynamic Merge queries, there are different aspects that need to be parameterized. For example:

  • Merge Query

  • Dataset Name

  • Source Table

  • Target Table

  • Join Keys

  • Fields

Data Fusion supports recursive macros that can be used to fully parameterize Merge queries that will require a nested macro.

Solution

  1. Parameterize the BQ Execute action with a single parameter for the SQL query.

 

2. Construct the Merge query using parameters:

MERGE ${dataset}.${target_table} A USING ${dataset}.${source_table} H ON A.${join_key} = H.${join_key} WHEN NOT MATCHED THEN INSERT (${fields}) VALUES (${fields})

3. Pass in the parameters required during runtime:

 

Â