Hadoop MapReduce Job Scheduling Algorithms Survey and Use Cases


  •  Alaa A. Abdallat    
  •  Arwa I. Alahmad    
  •  Duaa A. AlSahebAlTamimi
  •  Jaber A. AlWidian    

Abstract

Data is the fastest-growing asset of the 21st century, and extracting insights from it has become essential, because traditional ecosystems cannot process the resulting volumes of data, which come at different structural levels and are produced rapidly. Within this paradigm, the need to process mostly real-time data, among other factors, highlights the need for optimized job scheduling algorithms, which are the focus of this paper. Job scheduling is one of the most important aspects of guaranteeing an efficient processing ecosystem with minimal execution time, one that exploits the available resources while granting every user a fair share of the dedicated resources. In this work, we first provide the necessary background on the Hadoop MapReduce framework. We then run a comparative analysis of different algorithms, classified according to several criteria: Cluster Environment, Job Allocation Strategy, Optimization Strategy, and Metrics of Quality. We also construct use cases to showcase the characteristics of selected job scheduling algorithms, and then present a comparative display featuring the details of the use cases.



This work is licensed under a Creative Commons Attribution 4.0 License.