Parsing and exploding JSON arrays in Wrangler
This content is now maintained on the CDAP doc wiki, here. Please add any comments to the CDAP wiki.
This page describes how to parse JSON in Wrangler. If the JSON contains arrays, the short tutorial below explains how those arrays can be exploded (flattened) into individual rows for further processing and cleanup.
Parsing JSON
Steps to follow:
1. Go to the CDF instance.
2. Navigate to Wrangler from the sidebar (it appears when you click the hamburger icon in the navbar).
3. From a Wrangler source (GCS or BigQuery), read a JSON file to wrangle.
4. Once in the Wrangler tab, open the drop-down on the column and choose Parse → JSON.
5. After Step 4, the fields in the JSON become columns. Identify the column that holds the JSON array.
6. Open the drop-down on that column and select Explode → Array (by flattening).
7. This explodes the elements of the JSON array into individual rows in the same column in Wrangler.
8. After Step 7, further directives can be applied based on the cleanup needed.
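To make the effect of the two menu actions concrete, here is a minimal Python sketch of what Parse → JSON followed by Explode → Array (by flattening) does to a single record. It is only an illustration and not Wrangler's implementation: the helper names `parse_json_column` and `explode_array`, the sample record, and the `body_<field>` column naming are assumptions made for this example. In Wrangler's directive language these actions are believed to correspond roughly to `parse-as-json` and `flatten`.

```python
import json


def parse_json_column(record, column):
    """Mimic Parse -> JSON: turn the top-level JSON fields into columns.

    Hypothetical helper; new columns are named <column>_<field>,
    which is an assumed naming scheme for this sketch.
    """
    parsed = json.loads(record[column])
    out = {k: v for k, v in record.items() if k != column}
    for key, value in parsed.items():
        out[f"{column}_{key}"] = value
    return out


def explode_array(records, column):
    """Mimic Explode -> Array (by flattening): one output row per element.

    Rows whose column is not a non-empty list are passed through unchanged
    (an assumption of this sketch).
    """
    exploded = []
    for rec in records:
        value = rec.get(column)
        if isinstance(value, list) and value:
            for element in value:
                # Copy the row and replace the array with a single element.
                exploded.append({**rec, column: element})
        else:
            exploded.append(dict(rec))
    return exploded


# Sample input: one row whose "body" column holds a JSON document.
raw = {"body": '{"id": 7, "items": [{"sku": "a"}, {"sku": "b"}]}'}

parsed = parse_json_column(raw, "body")      # columns: body_id, body_items
rows = explode_array([parsed], "body_items")  # two rows, one per array element

for row in rows:
    print(row)
```

After the explode step, the non-array columns (here `body_id`) are repeated on every row, and the array column holds one element per row, which is the shape that further cleanup directives would then operate on.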