...

  1. Set the engine config spark.streaming.blockInterval to 30000 (milliseconds, i.e. 30 seconds). Apply this setting whenever a realtime pipeline has a GCS sink; it reduces the number of part files the sink creates in GCS.

  2. Set the runtime argument system.resources.reserved.memory.override to 1024. This reserves 1 GB of memory overhead for the Spark process so that YARN does not kill its container.
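
The two settings above can be written as plain key/value maps, and a little arithmetic suggests why a larger block interval means fewer part files. The sketch below is illustrative only. It assumes a receiver-based stream, where each receiver produces roughly (batch interval / block interval) blocks per batch, and it uses a hypothetical 60-second batch interval that is not stated in the steps. The helper name blocks_per_batch is made up for this example. The 200 ms figure is Spark's documented default for spark.streaming.blockInterval.

```python
def blocks_per_batch(batch_interval_ms: int, block_interval_ms: int) -> int:
    """Approximate number of blocks (and so tasks) per receiver per batch."""
    # At least one block is produced even if the block interval
    # is longer than the batch interval.
    return max(1, batch_interval_ms // block_interval_ms)


# Settings from the steps above, as plain key/value maps.
engine_config = {"spark.streaming.blockInterval": "30000"}
runtime_args = {"system.resources.reserved.memory.override": "1024"}

BATCH_INTERVAL_MS = 60_000        # hypothetical batch interval (assumption)
DEFAULT_BLOCK_INTERVAL_MS = 200   # Spark default block interval

before = blocks_per_batch(BATCH_INTERVAL_MS, DEFAULT_BLOCK_INTERVAL_MS)
after = blocks_per_batch(
    BATCH_INTERVAL_MS,
    int(engine_config["spark.streaming.blockInterval"]),
)

print(f"blocks per batch with default interval: {before}")
print(f"blocks per batch with 30000 ms interval: {after}")
```

Under these assumptions the block count per batch drops from 300 to 2. The actual number of part files depends on the pipeline's sources, partitioning, and batch interval.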

...