Table 1 Symbol and explanation of all parameters

From: A genetic algorithm-based job scheduling model for big data analytics

| Type | Symbol | Explanation |
| --- | --- | --- |
| Cluster | \(DC_i\) | The \(i\)th data center, \(i \in [1, N_{\text{dcs}}]\) |
| | \(B_{ii}\) | Bandwidth between nodes in \(DC_i\) |
| | \(V_{\text{dw}}\) | Speed of writing data to the local disk |
| Hadoop & HDFS | \(P_i\) | Partition size |
| | \(N_{\text{sm}}\) | Number of simultaneous maps executed in one node |
| | \(N_{\text{cr}}\) | Number of simultaneous reduces executed in one node |
| | \(N_{\text {cp\_threads}}\) | Number of I/O threads copying to one reduce node |
| | \(V_{\text {cp\_thread}}\) | Theoretical maximum copy speed of one copy thread |
| | \(V_{\text {reduce\_rep}}\) | Theoretical maximum output replication speed of one copy thread |
| | \(N_{\text{Spaths}}\) | Number of sort paths for copy |
| | \(N_{\text{reps}}\) | Number of replicas in HDFS |
| | \(S_{\text{buff}}\) | Sort buffer size for copy |
| App | \(DS_i\) | Input data size in the \(i\)th data center |
| | \(N_{\text{p}}\) | Number of partitions |
| | \(N_{\text{reduces}}\) | Number of reduces |
| | \(M_{\text{thruput}}\) | Average map throughput of each node |
| | \(R_{\text{thruput}}\) | Average reduce throughput of each node |
| | RIOmap | Ratio of map output to input size |
| | RIOreduce | Ratio of reduce output to input size |
| Module | \(T_{\text{total}}\) | Total execution time |
| | \(T_{\text{prepare}}\) | Total execution time for raw data input into HDFS |
| | \(T_{\text{job}}\) | Total execution time for a job |
| | \(T_{\text{map}}\) | Time for a map wave |
| | \(T_{\text{copy}}\) | Time for a copy wave |
| | \(T_{\text{sort}}\) | Time for a sort phase |
| | \(T_{\text{reduce}}\) | Time for a reduce phase |
| | \(T_{\text{rp}}\) | Time for reduce processing |
| | \(T_{\text{ro}}\) | Time for reduce output writing |
| | \(N_{\text{mw}}\) | Number of map waves |