Problem

Pipelines fail with the following error in the log:

io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Insufficient 'DISKS_TOTAL_GB' quota. Requested 3000.0, available 2048.0

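This error means that provisioning the Dataproc cluster for your pipeline would push the project past its Compute Engine (GCE) quota for persistent disks. In the example above, the cluster requested 3000 GB of disk, but only 2048 GB was available. Because the Dataproc cluster cannot be provisioned, the pipeline fails.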
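To see how far over quota a run was, the numbers can be pulled straight out of the log line. The following is a minimal sketch, assuming the message always has the shape shown above; the function name `parse_quota_error` and the regular expression are illustrative and not part of any product API.

```python
import re

# Pattern for the quota message shown in the log line above.
# Assumption: the wording and number format stay the same across versions.
QUOTA_ERROR = re.compile(
    r"Insufficient '(?P<quota>[A-Z0-9_]+)' quota\. "
    r"Requested (?P<requested>\d+(?:\.\d+)?), "
    r"available (?P<available>\d+(?:\.\d+)?)"
)


def parse_quota_error(log_line):
    """Return (quota_name, requested_gb, available_gb, shortfall_gb), or None if no match."""
    match = QUOTA_ERROR.search(log_line)
    if match is None:
        return None
    requested = float(match.group("requested"))
    available = float(match.group("available"))
    return match.group("quota"), requested, available, requested - available


line = (
    "io.grpc.StatusRuntimeException: INVALID_ARGUMENT: "
    "Insufficient 'DISKS_TOTAL_GB' quota. Requested 3000.0, available 2048.0"
)
print(parse_quota_error(line))
# → ('DISKS_TOTAL_GB', 3000.0, 2048.0, 952.0)
```

The last value is the shortfall: the cluster needs to request at least 952 GB less, or the quota needs to grow by at least that much.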

Solution(s)

There are two ways to resolve this issue: raise your project quota, or configure Dataproc disk sizes.

Raise your project quota

The quota that must be raised for this error is Persistent disk standard (GB). Compute Engine has both project-wide and regional quotas. For more information, including the steps to request an increase, see the quota documentation: https://cloud.google.com/compute/quotas

Configure Dataproc disk sizes

The disk sizes of the Dataproc cluster can be configured through cluster properties to keep the cluster under quota. Override the defaults by adding runtime arguments to the pipeline, as described in Setting custom Dataproc cluster properties.

In this case, the relevant properties are:

system.profile.properties.masterDiskGB

system.profile.properties.workerDiskGB

Set these to values (in GB) low enough that the resulting Dataproc cluster stays under quota.
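As a rough way to choose values, the sketch below computes a worker disk size that fits within the available quota and returns the two properties as runtime arguments. It rests on assumptions not stated on this page: that the cluster's total disk is the number of master nodes times masterDiskGB plus the number of worker nodes times workerDiskGB, and that nothing else in the region draws on the same quota. The helper name `disk_runtime_args` and the node counts are hypothetical.

```python
def disk_runtime_args(available_gb, masters, workers, master_disk_gb):
    """Build runtime arguments whose disk sizes fit within available_gb.

    Assumption: total disk = masters * masterDiskGB + workers * workerDiskGB.
    """
    if masters < 1 or workers < 1:
        raise ValueError("masters and workers must both be at least 1")

    # Whatever the masters do not use is split evenly across the workers.
    remaining_gb = available_gb - masters * master_disk_gb
    worker_disk_gb = int(remaining_gb // workers)
    if worker_disk_gb <= 0:
        raise ValueError("available quota is too small for this master disk size")

    return {
        "system.profile.properties.masterDiskGB": str(master_disk_gb),
        "system.profile.properties.workerDiskGB": str(worker_disk_gb),
    }


# Example values only: 1 master and 2 workers against the 2048 GB from the error above.
args = disk_runtime_args(available_gb=2048, masters=1, workers=2, master_disk_gb=500)
for key, value in args.items():
    print(f"{key}={value}")
# → system.profile.properties.masterDiskGB=500
# → system.profile.properties.workerDiskGB=774
```

The result is an upper bound under those assumptions. Choosing somewhat smaller values leaves headroom for other disks in the same region.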
