...
The streaming pipeline keeps running for a long time after the ‘Stop’ button is clicked.
Logs show task failures and report that RDDs cannot be found.
The Spark Streaming UI shows many active batches, and the batch processing time is greater than the configured batch interval.
To open the Spark Streaming UI, navigate to Dataproc → Clusters → Web Interfaces → YARN ResourceManager → ApplicationMaster (under Tracking UI) → Streaming.
...
Solution(s)
Increase the batch interval so that each batch finishes processing before the next one is scheduled.
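As a rough illustration of why a longer interval helps, the sketch below estimates how many batches queue up when processing time exceeds the batch interval, and suggests a larger interval from processing times read off the Streaming UI. It is not part of any Spark or Dataproc API: the function names, the 1.25 headroom factor, and the assumption of a constant processing time per batch are all hypothetical choices made for this example.

```python
import math
from statistics import mean


def backlog_after(n_batches, processing_time_s, interval_s):
    """Estimate queued batches after n_batches intervals have elapsed.

    Assumes (for illustration only) that every batch takes the same
    processing_time_s and that batches are processed one at a time.
    """
    elapsed_s = n_batches * interval_s
    completed = min(n_batches, elapsed_s // processing_time_s)
    return n_batches - completed


def suggest_batch_interval(processing_times_s, current_interval_s, headroom=1.25):
    """Suggest a batch interval (seconds) that covers the observed processing time.

    headroom is a hypothetical safety factor, not a Spark setting.
    """
    if not processing_times_s:
        raise ValueError("need at least one observed processing time")
    avg = mean(processing_times_s)
    if avg <= current_interval_s:
        # Batches already finish within the interval; keep it unchanged.
        return current_interval_s
    return math.ceil(avg * headroom)


if __name__ == "__main__":
    # 10 s interval, batches taking about 15 s each: the queue keeps growing.
    print(backlog_after(30, 15, 10))
    print(suggest_batch_interval([14, 15, 16], 10))
```

With a 10-second interval and batches that take about 15 seconds, the backlog grows steadily, which matches the many active batches seen in the Streaming UI. The suggested value is only a starting point; confirm in the UI that processing time stays below the new interval.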
...