This page lists Dataproc error messages, along with their common causes and solutions.
For additional guidance, see
Cluster creation error messages
Operation timed out: Only 0 out of 2 minimum required datanodes/node managers running.
Cause: The master node is unable to create the cluster because it cannot communicate with worker nodes.
Solution:
- Check firewall rule warnings.
- Make sure the correct firewall rules are in place (see Overview of the default Dataproc firewall rules).
- Perform a connectivity test in the Google Cloud console to determine what is blocking communication between the master and worker nodes.
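As a rough complement to the console connectivity test, a TCP probe run from a worker node can show whether a given port on the master is reachable. The following is a minimal sketch and not part of the Dataproc tooling: `probe_tcp` is a hypothetical helper, and the host name and port in the usage comment are assumptions that depend on your cluster.

```shell
# probe_tcp HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device, so bash must be available on the node.
probe_tcp() {
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (hypothetical master host name and port):
#   probe_tcp my-cluster-m 8020 && echo reachable || echo blocked
```

A failed probe only indicates that something between the two nodes drops or refuses the connection; the firewall rules and the console connectivity test are still needed to identify what is responsible.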
Required 'compute.subnetworks.use' permission for 'projects/{projectId}/regions/{region}/subnetworks/{subnetwork}'
Cause: This error can occur when you attempt to set up a Dataproc cluster that uses a VPC network in another project, and the Dataproc Service Agent service account does not have the necessary permissions on the Shared VPC host project that contains the network.
Solution: Follow the steps listed in Create a cluster that uses a VPC network in another project.
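One of the permissions typically involved is the Compute Network User role on the host project. The sketch below only prints a candidate `gcloud` command so that it can be reviewed before running; the helper names are hypothetical, the project ID and project number are placeholders, and the linked guide remains the authoritative list of steps (it may require grants to additional service accounts).

```shell
# dataproc_service_agent PROJECT_NUMBER
# Prints the email of the Dataproc Service Agent for the project that owns the cluster.
dataproc_service_agent() {
  printf 'service-%s@dataproc-accounts.iam.gserviceaccount.com\n' "$1"
}

# network_user_grant_cmd HOST_PROJECT_ID SERVICE_PROJECT_NUMBER
# Prints (does not run) a gcloud command that grants the Compute Network User
# role on the Shared VPC host project to the Dataproc Service Agent.
network_user_grant_cmd() {
  printf 'gcloud projects add-iam-policy-binding %s --member=serviceAccount:%s --role=roles/compute.networkUser\n' \
    "$1" "$(dataproc_service_agent "$2")"
}

# Example with placeholder values:
#   network_user_grant_cmd my-host-project 123456789012
```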
The zone 'projects/zones/{zone}' does not have enough resources available to fulfill the request '(resource type:compute)'
Cause: The zone being used to create the cluster does not have sufficient resources.
Solution:
- Create the cluster in a different zone.
- Use the Dataproc Auto Zone placement feature.
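Trying a different zone can be scripted as a simple fallback loop. This is a minimal sketch under stated assumptions: `create_in_first_available_zone` and `create_cluster` are hypothetical helpers, and `CLUSTER_NAME`, `REGION`, and the zone list are placeholders you would supply.

```shell
# create_in_first_available_zone CREATE_FN ZONE...
# Calls CREATE_FN with each zone in turn and stops at the first success.
# Prints the zone that worked on stdout; returns 1 if every zone failed.
create_in_first_available_zone() {
  local create_fn="$1"
  shift
  local zone
  for zone in "$@"; do
    if "$create_fn" "$zone"; then
      echo "$zone"
      return 0
    fi
    echo "Creation failed in $zone, trying the next zone" >&2
  done
  return 1
}

# Hypothetical wrapper around the real cluster creation command.
# CLUSTER_NAME and REGION must be set by the caller.
create_cluster() {
  gcloud dataproc clusters create "${CLUSTER_NAME:?}" --region="${REGION:?}" --zone="$1"
}

# Example with placeholder zones:
#   create_in_first_available_zone create_cluster us-central1-a us-central1-b us-central1-c
```

The loop retries on any failure, not only on resource exhaustion, so a real script would probably inspect the error message before moving to the next zone.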
Quota Exceeded errors
Insufficient CPUS/CPUS_ALL_REGIONS quota
Insufficient 'DISKS_TOTAL_GB' quota
Insufficient 'IN_USE_ADDRESSES' quota
Cause: Your CPU, disk, or IP address request exceeds your available quota.
Solution: Request additional quota from the Google Cloud console.
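Before requesting more quota, it can help to estimate how much is missing. The sketch below assumes you have already read the limit and current usage for the relevant metric (for example, from the quotas listed by `gcloud compute regions describe REGION`); `quota_shortfall` is a hypothetical helper, and it expects whole numbers, so a value such as `24.0` must be trimmed to `24` first.

```shell
# quota_shortfall LIMIT USAGE REQUESTED
# Prints how many additional units of quota the request needs,
# or 0 if the request fits within the remaining quota.
quota_shortfall() {
  local available=$(( $1 - $2 ))
  local shortfall=$(( $3 - available ))
  if [ "$shortfall" -gt 0 ]; then
    echo "$shortfall"
  else
    echo 0
  fi
}

# Example: CPU limit 24, 16 already in use, and a cluster of
# 1 master + 2 workers with 4 vCPUs each (12 vCPUs requested):
#   quota_shortfall 24 16 12
```

In this example only 8 vCPUs remain, so the request is 4 vCPUs over quota; either the cluster is made smaller or at least 4 more vCPUs of quota are requested.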
Initialization action failed
Cause: The initialization action provided during cluster creation failed to install.
Solution:
- See initialization actions considerations and guidelines.
- Examine the output logs. The error message should provide a link to the logs in Cloud Storage.
Failed to initialize node {cluster-name}: {component}
Cause: A Dataproc component failed to initialize.
Solution: Refer to:
Cluster creation failed: IP address space exhausted
Cause: The IP address space needed to provision the requested cluster nodes is unavailable.
Solution:
- Create a cluster on a different subnetwork or network.
- Reduce usage on the network to free IP address space.
- Wait until sufficient IP space becomes available on the network.
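To judge whether a subnetwork can hold the cluster at all, the capacity of its primary IPv4 range can be estimated from the prefix length. This is a rough sketch: `usable_ipv4_addresses` is a hypothetical helper, it assumes Google Cloud reserves four addresses in each primary IPv4 subnet range, and it does not account for addresses already used by other resources.

```shell
# usable_ipv4_addresses PREFIX_LENGTH
# Prints the number of addresses available to VMs in a subnet's primary
# IPv4 range, assuming four addresses in the range are reserved.
usable_ipv4_addresses() {
  local total=$(( 1 << (32 - $1) ))
  echo $(( total - 4 ))
}

# Example: a /24 range has 256 addresses, of which 252 are usable.
#   usable_ipv4_addresses 24
```

Each cluster node needs one internal address, so a cluster with more nodes than the remaining free addresses cannot be provisioned on that subnetwork.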
Initialization script error message: The repository REPO_NAME no longer has a Release file
Cause: The Debian oldstable backports repository was purged.
Solution:
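Add the following code before the code that runs apt-get in your initialization script.

    # Look up the current Debian oldstable and stable codenames.
    oldstable=$(curl -s https://deb.debian.org/debian/dists/oldstable/Release | awk '/^Codename/ {print $2}')
    stable=$(curl -s https://deb.debian.org/debian/dists/stable/Release | awk '/^Codename/ {print $2}')

    # Find apt source files that mention a backports repository.
    matched_files="$(grep -rsil '\-backports' /etc/apt/sources.list*)"
    if [[ -n "$matched_files" ]]; then
      # Left unquoted so that each matching file is handled separately.
      for filename in $matched_files; do
        # Keep files that reference a current backports suite;
        # otherwise blank out their backports lines.
        grep -e "$oldstable-backports" -e "$stable-backports" "$filename" || \
          sed -i -e 's/^.*-backports.*$//' "$filename"
      done
    fi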