H-Quincy: Fair Scheduling for Hadoop Clusters

H-Quincy implements the paper "Quincy: Fair Scheduling for Distributed Computing Clusters" on Hadoop. It replaces the default queue-based MapReduce scheduler with a flow-based one: a min-cost flow is computed and updated to assign map tasks across the cluster, according to the size of each data split and the communication overhead in the cluster's hierarchy.
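To make the flow-based idea concrete, here is a minimal, self-contained sketch of task placement as a min-cost flow. It is not H-Quincy's actual code: the class `MinCostFlowSketch`, the method `assign`, and the cost values are hypothetical and chosen only for illustration. The graph is also simplified. Each task connects directly to each machine with a precomputed transfer cost, whereas the Quincy paper routes tasks through rack and cluster aggregator nodes and an "unscheduled" node. The solver is plain successive shortest paths with Bellman-Ford, picked for brevity rather than speed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy min-cost-flow task placement; hypothetical names, not H-Quincy's classes. */
class MinCostFlowSketch {
    // Edge i and its residual twin i ^ 1 are stored next to each other.
    private final List<int[]> edges = new ArrayList<>(); // {to, capacity, cost}
    private final List<List<Integer>> adj = new ArrayList<>();

    MinCostFlowSketch(int vertexCount) {
        for (int i = 0; i < vertexCount; i++) adj.add(new ArrayList<>());
    }

    void addEdge(int from, int to, int capacity, int cost) {
        adj.get(from).add(edges.size());
        edges.add(new int[] {to, capacity, cost});
        adj.get(to).add(edges.size());
        edges.add(new int[] {from, 0, -cost}); // residual edge
    }

    /** Successive shortest paths: augment along the cheapest path until none is left. */
    void solve(int source, int sink) {
        int n = adj.size();
        while (true) {
            int[] dist = new int[n];
            int[] prevEdge = new int[n];
            Arrays.fill(dist, Integer.MAX_VALUE);
            dist[source] = 0;
            boolean changed = true;
            for (int round = 0; round < n && changed; round++) { // Bellman-Ford
                changed = false;
                for (int u = 0; u < n; u++) {
                    if (dist[u] == Integer.MAX_VALUE) continue;
                    for (int id : adj.get(u)) {
                        int[] e = edges.get(id);
                        if (e[1] > 0 && dist[u] + e[2] < dist[e[0]]) {
                            dist[e[0]] = dist[u] + e[2];
                            prevEdge[e[0]] = id;
                            changed = true;
                        }
                    }
                }
            }
            if (dist[sink] == Integer.MAX_VALUE) return; // no augmenting path left
            // Find the bottleneck capacity on the path, then push that much flow.
            int push = Integer.MAX_VALUE;
            for (int v = sink; v != source; v = edges.get(prevEdge[v] ^ 1)[0])
                push = Math.min(push, edges.get(prevEdge[v])[1]);
            for (int v = sink; v != source; v = edges.get(prevEdge[v] ^ 1)[0]) {
                edges.get(prevEdge[v])[1] -= push;
                edges.get(prevEdge[v] ^ 1)[1] += push;
            }
        }
    }

    /** cost[t][m] = cost of running task t on machine m; slots[m] = free map slots. */
    static int[] assign(int[][] cost, int[] slots) {
        int tasks = cost.length, machines = slots.length;
        int source = 0, sink = tasks + machines + 1;
        MinCostFlowSketch g = new MinCostFlowSketch(sink + 1);
        for (int t = 0; t < tasks; t++) {
            g.addEdge(source, 1 + t, 1, 0); // each task is scheduled at most once
            for (int m = 0; m < machines; m++) g.addEdge(1 + t, 1 + tasks + m, 1, cost[t][m]);
        }
        for (int m = 0; m < machines; m++) g.addEdge(1 + tasks + m, sink, slots[m], 0);
        g.solve(source, sink);
        int[] placement = new int[tasks];
        Arrays.fill(placement, -1); // -1 means the task stays unscheduled
        for (int t = 0; t < tasks; t++)
            for (int id : g.adj.get(1 + t)) {
                int[] e = g.edges.get(id);
                // Forward edges have even ids; a saturated one carries the task.
                if (id % 2 == 0 && e[1] == 0) placement[t] = e[0] - tasks - 1;
            }
        return placement;
    }

    public static void main(String[] args) {
        // Assumed costs: 0 = data-local, 1 = same rack, 3 = across racks.
        int[][] cost = {{0, 3}, {1, 3}, {1, 0}};
        int[] slots = {1, 2};
        System.out.println(Arrays.toString(assign(cost, slots))); // prints [0, 1, 1]
    }
}
```

In this example machine 0 has a single free slot, so only task 0 runs data-local there. Task 1 pays the cross-rack cost on machine 1, because any other placement would raise the total cost above 3.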

Install

git clone https://github.com/puncsky/H-Quincy.git


You can either build from source code or use the JAR directly.

• Build from Source Code. Replace the files in $HADOOP_HOME/src/mapred/org/apache/hadoop/mapred with the files in src/, then change into $HADOOP_HOME and build with ant.

3.4 Preemption and Fairness

There are four versions of Quincy:

• Quincy without Preemption and without Fairness (Q).
• Quincy with Preemption and without Fairness (QP).
• Quincy without Preemption and with Fairness (QF).
• Quincy with Preemption and with Fairness (QPF).

Due to time constraints, our current implementation includes neither preemption nor fairness, so it corresponds to version Q. Preemption is easy to add, but fairness control requires modifying more classes and source files.