Problem
Customers are seeing the following issue when running pipelines.
...
The pipelines keep running for a long time and appear never to finish.
Pipeline metrics keep resetting, indicating that the jobs are reprocessing.
Logs indicate that Spark is not able to fit an RDD in memory.
Logs show a false message that the RDD is being persisted to disk.
...
Navigate to the pipeline detail page.
In the Configure menu, click on Engine config.
Enter 'spark.cdap.pipeline.autocache.enable' as the key, and 'false' as the value.
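For reference, the key/value pair entered in the last step can be sketched as a plain mapping. This is only an illustration, not the product's API: the helper name `disable_autocache` and the dictionary representation of the engine config are assumptions, and `spark.executor.memory` stands in for any entry that might already be present. The UI steps above remain the documented way to apply the setting.

```python
# Key named in the steps above; the value is the string 'false'.
AUTOCACHE_KEY = "spark.cdap.pipeline.autocache.enable"


def disable_autocache(engine_config):
    """Return a copy of an engine-config mapping with auto-caching disabled.

    Hypothetical helper: it only models the key/value pair typed into the
    Engine config dialog and does not talk to any real service.
    """
    updated = dict(engine_config)  # copy so the caller's mapping is untouched
    updated[AUTOCACHE_KEY] = "false"
    return updated


# Hypothetical existing entry, used only to show that other keys are preserved.
existing = {"spark.executor.memory": "4g"}
print(disable_autocache(existing))
```

The value is kept as the string 'false' rather than a boolean because the dialog takes both key and value as text.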
...