...

  • The streaming pipeline keeps running for a long time after the ‘Stop’ button is clicked.

  • Logs show task failures and report that RDDs cannot be found.

  • The Spark Streaming UI shows many active batches, and the batch processing time exceeds the configured batch interval.

    • To open the Spark Streaming UI, navigate to Dataproc → Clusters → Web Interfaces → YARN ResourceManager → ApplicationMaster (under Tracking UI) → Streaming
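
To see why a processing time longer than the batch interval leads to a growing pile of active batches, the following is a minimal sketch of the arithmetic, not Spark code. It assumes batches are processed one at a time in arrival order; the function name `scheduling_delays` and the sample numbers are hypothetical and chosen only for illustration.

```python
def scheduling_delays(batch_interval_s, processing_times_s):
    """Return how long each batch waits before it starts processing.

    Simplified model (an assumption, not Spark's scheduler): a new batch
    arrives every `batch_interval_s` seconds and batches run strictly
    one after another.
    """
    delays = []
    prev_finish = 0
    for i, processing_time in enumerate(processing_times_s):
        arrival = i * batch_interval_s
        # A batch cannot start until the previous one has finished.
        start = max(arrival, prev_finish)
        delays.append(start - arrival)
        prev_finish = start + processing_time
    return delays


# Hypothetical numbers: each batch takes 15 s to process.
print(scheduling_delays(10, [15, 15, 15, 15]))  # 10 s interval: delay grows
print(scheduling_delays(20, [15, 15, 15, 15]))  # 20 s interval: no delay
```

In this model, a 10-second interval with 15-second batches adds 5 seconds of waiting per batch, so the queue never drains; once the interval exceeds the processing time, the delay stays at zero.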

...

Solution(s)

Increase batch interval

...