Overview

This document collects best practices for parsing and cleansing CSV files with Wrangler.

General Tips

  • When parsing CSV with parse-as-csv, avoid automatic header determination by passing false as the header flag (parse-as-csv :col '\t' false). Large files are distributed across multiple partitions, and the header row is not available in every partition, so it cannot be used to set column names consistently.
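
The partitioning problem described above can be illustrated with a small simulation. This is a minimal sketch in plain Python, not Wrangler code: the sample data, the column names, and the helper functions split_into_partitions and parse_partition are all hypothetical, and the even line-based split only approximates how a real execution engine divides a file. It shows that only the partition holding the start of the file sees the header row, which is why column names are assumed to be supplied explicitly and the header line is dropped as an ordinary data row.

```python
import csv

def split_into_partitions(text, num_partitions):
    """Split the lines into contiguous chunks (hypothetical stand-in for file partitioning)."""
    lines = text.splitlines()
    size = -(-len(lines) // num_partitions)  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def parse_partition(lines, delimiter="\t"):
    """Parse one partition without header detection: every line is treated as data."""
    return list(csv.reader(lines, delimiter=delimiter))

# Hypothetical tab-separated sample: one header row followed by four records.
SAMPLE = "id\tname\n1\talice\n2\tbob\n3\tcarol\n4\tdave\n"
COLUMNS = ["id", "name"]  # assumption: column names are known up front

partitions = split_into_partitions(SAMPLE, 2)

# The first row of each partition: only one of them is the header.
first_rows = [parse_partition(part)[0] for part in partitions]

# Apply the known column names everywhere and drop the header row where it appears.
records = []
for part in partitions:
    for row in parse_partition(part):
        if row == COLUMNS:
            continue  # the header line arrives as a regular data row
        records.append(dict(zip(COLUMNS, row)))
```

If the header were detected per partition, the second partition in this sketch would treat a data row as its column names. Disabling detection and naming the columns explicitly keeps every partition consistent.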
