Overview
This article documents recommended configurations for running pipelines against a static Dataproc cluster. See also the article on how to Run pipelines against existing Dataproc clusters.
General Tips
Set the following configurations while creating a static Dataproc cluster to run pipelines.
yarn.nodemanager.delete.debug-delay-sec - Number of seconds the NodeManager waits after an application finishes before deleting its local files and logs. Setting it retains YARN logs for debugging. Recommended value: 86400 (1 day).
yarn.nodemanager.pmem-check-enabled - Controls whether YARN enforces the physical memory limit and kills containers that exceed it. Recommended value: false.
yarn.nodemanager.vmem-check-enabled - Controls whether YARN enforces the virtual memory limit and kills containers that exceed it. Recommended value: false.
These configurations can be set by clicking Add Cluster Property while creating the cluster from the Google Cloud console.
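For clusters created from a script rather than the console, the same three properties can be passed through the --properties flag of gcloud dataproc clusters create, where the yarn: prefix targets yarn-site.xml. The following is a minimal sketch that only builds and prints the command. The cluster name my-static-cluster, the region us-central1, and the helper function names are hypothetical placeholders, not values from this article.

```python
# Recommended YARN settings from this article, keyed by property name.
YARN_PROPERTIES = {
    "yarn.nodemanager.delete.debug-delay-sec": "86400",  # keep logs for 1 day
    "yarn.nodemanager.pmem-check-enabled": "false",
    "yarn.nodemanager.vmem-check-enabled": "false",
}


def build_properties_flag(props, prefix="yarn"):
    """Format properties as a Dataproc --properties flag.

    Each entry becomes "<prefix>:<name>=<value>"; the "yarn" prefix
    tells Dataproc to write the property into yarn-site.xml.
    """
    pairs = [f"{prefix}:{name}={value}" for name, value in props.items()]
    return "--properties=" + ",".join(pairs)


def build_create_command(cluster_name, region):
    """Return the gcloud argument list for creating the cluster.

    cluster_name and region are supplied by the caller; the values used
    below are placeholders (assumptions), not recommendations.
    """
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        f"--region={region}",
        build_properties_flag(YARN_PROPERTIES),
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(build_create_command("my-static-cluster", "us-central1")))
```

Printing the command rather than running it keeps the sketch free of side effects. The argument list can be passed to subprocess.run once the name, region, and any other cluster options have been reviewed.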