CPU – Large-Scale Data Engineering in Cloud

CPU, Hadoop, YARN

YARN – Negative vCores – Capacity Scheduler with Memory Resource Type

May 8, 2020

You might expect the total number of vCores available to YARN to limit the number of containers you can run concurrently, but that is not true in some cases.

Let’s consider one of them – Capacity Scheduler with DefaultResourceCalculator (Memory only).
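To make the idea concrete, here is a minimal sketch of memory-only placement. It is not YARN code: the `Node` class, the `allocate_memory_only` function and all the numbers (56 GB and 16 vCores per node, 2 GB and 1 vCore per container) are assumptions for illustration. It only models the behavior described above, where a request is accepted or rejected by memory alone while vCores are still counted.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node capacity as seen by the scheduler (illustrative values)."""
    memory_mb: int
    vcores: int
    used_memory_mb: int = 0
    used_vcores: int = 0

    @property
    def available_vcores(self) -> int:
        # Can drop below zero because vCores never block an allocation here.
        return self.vcores - self.used_vcores


def allocate_memory_only(node: Node, container_memory_mb: int, container_vcores: int) -> int:
    """Launch containers until memory runs out; vCores are tracked but not checked."""
    launched = 0
    while node.used_memory_mb + container_memory_mb <= node.memory_mb:
        node.used_memory_mb += container_memory_mb
        node.used_vcores += container_vcores
        launched += 1
    return launched


# Assumed node: 56 GB of memory and 16 vCores given to YARN.
node = Node(memory_mb=57344, vcores=16)
launched = allocate_memory_only(node, container_memory_mb=2048, container_vcores=1)
print(f"containers launched: {launched}, vCores available: {node.available_vcores}")
```

With these assumed numbers, memory allows more containers than there are vCores, so the available vCores counter ends up negative.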


dmtolpeko
AWS, CPU, EC2, EMR, Hadoop, Qubole, YARN

AWS EC2 vCPU and YARN vCores – M4, C4, R4 Instances

May 7, 2020

Let’s review how EC2 vCPUs correspond to YARN vCores in Amazon EMR and Qubole Hadoop clusters. As examples, I will use the m4.4xlarge, r4.4xlarge and c4.4xlarge EC2 instance types.

An EC2 vCPU is a thread of a CPU core (typically, there are two threads per core). Does that mean the number of YARN vCores should equal the number of EC2 vCPUs? That’s not always the case.
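The thread-versus-core arithmetic can be sketched in a few lines. The helper `physical_cores` is hypothetical, the two-threads-per-core ratio is the typical value mentioned above, and the vCPU counts (16 for each of the three 4xlarge types) are taken from the AWS instance specifications as I recall them, so check them against the current AWS documentation. The sketch says nothing about what value EMR or Qubole actually assigns to YARN vCores.

```python
THREADS_PER_CORE = 2  # assumption: the typical ratio, not guaranteed for every instance type


def physical_cores(vcpus: int, threads_per_core: int = THREADS_PER_CORE) -> int:
    """Convert a vCPU (hardware thread) count into a physical core count."""
    if threads_per_core <= 0 or vcpus % threads_per_core != 0:
        raise ValueError("vCPU count must be a positive multiple of threads per core")
    return vcpus // threads_per_core


# vCPU counts per instance type (assumed from AWS specs; verify before relying on them).
instance_vcpus = {"m4.4xlarge": 16, "r4.4xlarge": 16, "c4.4xlarge": 16}

cores = {name: physical_cores(vcpus) for name, vcpus in instance_vcpus.items()}
for name, count in cores.items():
    print(f"{name}: {instance_vcpus[name]} vCPUs -> {count} physical cores")
```

Under these assumptions each instance type has half as many physical cores as vCPUs, which is one reason the YARN vCores setting does not have to match the vCPU count.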


dmtolpeko