From: A genetic algorithm-based job scheduling model for big data analytics
| Type | Symbol | Explanation |
|---|---|---|
| Cluster | \(DC_i\) | The ith data center, \(i \in [1, N_{dcs}]\) |
| | \(B_{ii}\) | Bandwidth between nodes in \(DC_i\) |
| | \(V_{dw}\) | Speed of writing data to the local disk |
| Hadoop & HDFS | \(P_i\) | Partition size |
| | \(N_{sm}\) | Number of simultaneous maps executed in one node |
| | \(N_{cr}\) | Number of simultaneous reduces executed in one node |
| | \(N_{\text {cp\_threads}}\) | Number of I/O threads copying to one reduce node |
| | \(V_{\text {cp\_thread}}\) | Theoretical maximum copy speed of one copy thread |
| | \(V_{\text {reduce\_rep}}\) | Theoretical maximum output replication speed of one copy thread |
| | \(N_{Spaths}\) | Number of sort paths for copy |
| | \(N_{reps}\) | Number of replicas in HDFS |
| | \(S_{buff}\) | Sort buffer size for copy |
| App | \(DS_i\) | Input data size in the ith data center |
| | \(N_p\) | Number of partitions |
| | \(N_{reduces}\) | Number of reduces |
| | \(M_{thruput}\) | Average map throughput of each node |
| | \(R_{thruput}\) | Average reduce throughput of each node |
| | RIOmap | Ratio of map output to input size |
| | RIOreduce | Ratio of reduce output to input size |
| Module | \(T_{total}\) | Total execution time |
| | \(T_{prepare}\) | Total execution time for raw data input into HDFS |
| | \(T_{job}\) | Total execution time for a job |
| | \(T_{map}\) | Time for a map wave |
| | \(T_{copy}\) | Time for a copy wave |
| | \(T_{sort}\) | Time for a sort phase |
| | \(T_{reduce}\) | Time for a reduce phase |
| | \(T_{rp}\) | Time for reduce processing |
| | \(T_{ro}\) | Time for reduce output writing |
| | \(N_{mw}\) | Number of map waves |
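
As a reading aid, a few of the symbols above can be grouped into plain data structures. The following is a minimal sketch, not the paper's implementation: the class names, field names, and units (MB) are hypothetical. The helper `estimate_partitions` assumes that each data center's input is split into fixed-size partitions of size P_i, which the table itself does not state. `map_output_sizes` only applies the table's definition of RIOmap as the ratio of map output to input size.

```python
from dataclasses import dataclass
from math import ceil
from typing import List


@dataclass
class HadoopParams:
    """Subset of the 'Hadoop & HDFS' symbols (field names are hypothetical)."""
    partition_size: float      # P_i: partition size (assumed unit: MB)
    simultaneous_maps: int     # N_sm: simultaneous maps executed in one node
    simultaneous_reduces: int  # N_cr: simultaneous reduces executed in one node
    replicas: int              # N_reps: number of replicas in HDFS


@dataclass
class AppParams:
    """Subset of the 'App' symbols (field names are hypothetical)."""
    input_sizes: List[float]   # DS_i: input data size per data center (assumed unit: MB)
    num_reduces: int           # N_reduces: number of reduces
    map_io_ratio: float        # RIOmap: ratio of map output to input size
    reduce_io_ratio: float     # RIOreduce: ratio of reduce output to input size


def estimate_partitions(app: AppParams, hadoop: HadoopParams) -> int:
    """Estimate N_p.

    Assumption (not from the table): each data center's input is cut into
    partitions of size P_i, with a final partial partition rounded up.
    """
    return sum(ceil(ds / hadoop.partition_size) for ds in app.input_sizes)


def map_output_sizes(app: AppParams) -> List[float]:
    """Map output size per data center, using RIOmap = map output / map input."""
    return [ds * app.map_io_ratio for ds in app.input_sizes]


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    hadoop = HadoopParams(partition_size=128.0, simultaneous_maps=2,
                          simultaneous_reduces=2, replicas=3)
    app = AppParams(input_sizes=[1024.0, 300.0], num_reduces=4,
                    map_io_ratio=0.5, reduce_io_ratio=0.25)
    print(estimate_partitions(app, hadoop))
    print(map_output_sizes(app))
```

Keeping the cluster, Hadoop, and application parameters in separate structures mirrors the Type column of the table, so that a scheduler can vary the application inputs while holding the platform settings fixed.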