Avro

This topic is maintained on the CDAP documentation wiki under the title Working with Avro Sources.

Overview

This is a collection of best practices for wrangling Avro files.

Recommendation

  • We highly recommend always reading Avro files with the File source connector.

General Tips

  • Avro data file parsing (parse-as-avro-file) is limited to files of 5 MB or less.

  • For files over 5 MB, we recommend using the File source connector with the format set to avro.

  • Because an Avro file is one large file that contains many records, parsing an Avro file in Wrangler can take some time. Do not add any transformation steps to the recipe until parsing is finished. If you add transformation steps before parsing completes and the transformation fails at any point, the parsing process will not complete successfully.
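The size guidance in the tips above can be checked before a file is opened in Wrangler. The following is a minimal Python sketch, not part of CDAP: the helper name `exceeds_wrangler_avro_limit` and the constant are hypothetical, and it assumes that "5 MB" means 5 * 1024 * 1024 bytes, which this page does not specify.

```python
import os

# Assumption: "5 MB" is interpreted as 5 * 1024 * 1024 bytes. This page does
# not say whether the limit is measured in binary or decimal megabytes.
WRANGLER_AVRO_LIMIT_BYTES = 5 * 1024 * 1024


def exceeds_wrangler_avro_limit(size_bytes: int) -> bool:
    """Return True if a file of this size is over the 5 MB parsing limit.

    Hypothetical helper for illustration only; it is not a CDAP API.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes > WRANGLER_AVRO_LIMIT_BYTES


def path_exceeds_wrangler_avro_limit(path: str) -> bool:
    """Apply the same check to a file on the local filesystem."""
    return exceeds_wrangler_avro_limit(os.path.getsize(path))
```

When the check returns True, the tips above point to the File source connector with the format set to avro. When it returns False, parse-as-avro-file is within its limit, although the File source connector remains the recommended option.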

 

 
