Sunday 10 February 2019

Adaptive scheduling of Spark Job using YARN API

In last blog poorman spark monitoring i shared approach on how to figure out how long Spark Job is waiting for resource.

This post covers some more details on how to be proactive when Spark Job is stuck due to resource constraint.

Little recap of Yarn.
Apache Yarn is resource management and job scheduling framework for hadoop distributed processing framework.

MapReduce NextGen Architecture

Hadoop cluster are shared between teams and for proper utilization of cluster teams/projects are allocated some capacity of cluster.

One of the popular scheduling approach is "Capacity Scheduler" for multi tenant cluster and it is based on Queues.
Yarn allows to define min & max resource for Queue and it is hierarchical, it looks something like below

Capacity Scheduler


One of the issues that can happen in Capacity Scheduler is that your job is submitted to overloaded queue and it gets stuck in Accepted state for long time although other queues has some capacity which is just left unused.
Another common issue is Job started running but did not got all the resource(cores/memory) and will run forever because queue is overloaded.

Yarn gives REST API to query state of cluster/queues/application and that can be used to solve issues where resource is available in cluster but application is not using it :-)

Yarn API to build adaptive job submission
Yarn API comes very handy in solving both of the above issue, some of the strategy using yarn API.

 - Submit job to queue that has capacity.
This type of strategy will select queue at run-time and submit application to least loaded queue.

 - Move Job to queue that has capacity.
This type of strategy will monitor job status and if it is not moving or get stuck in "Accepted" state then will move it to queue that has some capacity.

Abstraction of Yarn API to get minimum details that will allow adaptive job submission.

Once we get all the metrics required for making decision then it becomes straight forward to submit/move the job.
Below code snippet try to move the job based on simple strategy of max wait time for Accepted status App.

Yarn exposes lots of metrics that can used to building adaptive system. You can refer to ResourceManagerRest for full set of API.

Word of caution that be fair when you are using this strategy, don't use whole cluster alone.
Image result for greedy


Code used in post is available @ yarn github project

No comments:

Post a Comment