CPU, Hadoop, YARN

YARN – Negative vCores – Capacity Scheduler with Memory Resource Type

You might expect that the total number of vCores available to YARN limits the number of containers you can run concurrently, but that is not true in some cases.

Let’s consider one of them – the Capacity Scheduler with DefaultResourceCalculator (memory only).

Capacity Scheduler

Capacity Scheduler is the default scheduler for managing YARN resources. You can check yarn.resourcemanager.scheduler.class in /etc/hadoop/conf/yarn-site.xml to see whether another scheduler is used.
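
On a cluster running the Capacity Scheduler this property typically looks like the following (the exact file location can vary by distribution):

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>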

By default, the Capacity Scheduler uses DefaultResourceCalculator, which takes only memory into account and ignores CPU resources.

Check /etc/hadoop/conf/capacity-scheduler.xml:

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
  </property>

In some Hadoop versions you can check the YARN Resource Manager UI to quickly see which scheduler is used (http://rm_ipaddress:8088/cluster/apps):


Negative vCores

So, with these default settings, let’s see how we can get negative vCores.

Consider a cluster whose compute nodes have 16 vCores and 122 GB of memory (for example, the r4.4xlarge Amazon EC2 instance), with 114 GB of memory available for YARN.
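
In YARN terms this node capacity corresponds to the following yarn-site.xml settings (the property names are standard, while the values below simply illustrate this instance type; 116736 MB is 114 GB):

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>116736</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>16</value>
  </property>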

A user can submit a Hive SQL query requesting many 1.5 GB containers.
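
For example, with Hive on Tez the container size could have been lowered with a setting like this (an illustration, not necessarily how the user did it; 1536 MB is 1.5 GB):

  <property>
    <name>hive.tez.container.size</name>
    <value>1536</value>
  </property>

During the query execution you can observe the following information in the YARN Resource Manager UI for the cluster nodes (http://rm_ipaddress:8088/cluster/nodes):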

You can see that the Capacity Scheduler with the default memory-only resource calculator keeps allocating containers as long as the YARN nodes have enough memory to run them, driving vCores to negative numbers, since every container still consumes at least 1 vCore (defined by yarn.scheduler.minimum-allocation-vcores).
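
The numbers add up as follows: with 114 GB available for YARN on a node and 1.5 GB per container, the scheduler can place up to 114 / 1.5 = 76 containers on that node, and since each container is accounted as at least 1 vCore, the node can report 16 - 76 = -60 vCores available.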

This shows that in such a configuration the number of vCores available to YARN is not relevant.
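
If you need YARN to enforce the vCore limit as well, you can switch the Capacity Scheduler to DominantResourceCalculator, which takes both memory and CPU into account, by changing capacity-scheduler.xml and restarting the Resource Manager:

  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>

See also AWS EC2 vCPU and YARN vCores – M4, C4, R4 Instances.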