
A Reference Architecture for Datacenter Scheduling

This release contains the software artifacts of the paper A Reference Architecture for Datacenter Scheduling, presented at Supercomputing 2018 (SC18).

For the paper, experiments were run on the following traces:

  • Askalon (W-Eng) - askalon_workload_ee
  • Chronos (W-Ind) - chronos_exp_noscaler_ca

Each trace directory has the following structure:

  • /setup.txt
    This text file describes the trace used for the experiment, the number of
    times the experiment was repeated, and the number of warm-up runs.

  • /setup.json
    This JSON file describes the topology of the datacenter used in the
    experiments. Each item lists the identifier of the resource (here, the
    CPU type) to use in the machine. The available CPU types are (1) Intel i7
    (4 cores, 4100 MHz) and (2) Intel i5 (2 cores, 3500 MHz).

  • /trace
    This directory contains the trace used in the simulation. The trace is
    stored in the Grid Workload Format (GWF); see the Grid Workload Archive
    for more information.

  • /data/experiments.csv
    A CSV file containing information about all simulations that have been run
    on the OpenDC platform for this experiment.

  • /data/job_metrics.csv
    A CSV file containing metrics (NSL, JMS, etc.) for each job that ran
    during the simulations.

  • /data/stage_measurements.csv
    A CSV file containing timing measurements for the scheduling stages that
    ran during the simulations.

  • /data/task_metrics.csv
    A CSV file containing metrics for each task that ran during the simulations.

  • /data/tasks.csv
    A CSV file containing information about the tasks (submit time, runtime,
    etc.) that ran during the simulations, as extracted from the traces.

Additionally, the format of each data file is described in its associated
metadata file.
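
To get a quick impression of the results, you can inspect the data files from
the command line. A minimal sketch, assuming the archive layout described
above and a POSIX shell:

$ head -n 5 askalon_workload_ee/data/experiments.csv
$ wc -l askalon_workload_ee/data/task_metrics.csv
$ python -m json.tool askalon_workload_ee/setup.json

The first command prints the header and first rows of the experiments file,
the second counts the task records (plus one header line), and the third
pretty-prints the datacenter topology.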

Hardware

The hardware used for running the experiments is a MacBook Pro with
a 2.9 GHz Intel Core i7 processor and 16 GB of 2133 MHz LPDDR3 memory.

Reproduction

This section provides instructions for reproducing the paper's results using
the provided Docker image. Please make sure you have Docker installed
and running.
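
To verify that Docker is installed and the daemon is running before you start:

$ docker --version
$ docker info

The first command prints the installed client version; the second reports an
error if the Docker daemon is not running.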

For reproduction, you will run the following experiments:

  • askalon_workload_ee
    This is the larger experiment of the paper and will take approximately 4
    hours to complete on similar hardware.
  • chronos_exp_noscaler_ca
    This is the smaller experiment of the paper and will take approximately 5
    minutes to complete on similar hardware.

The Docker image atlargeresearch/sc18-experiment-runner can be used to run
the experiments. A volume can be attached to the directory
/home/gradle/simulator/data to capture the experiment results.

Make sure you have, in your current working directory, the following files
(a quick check follows the list):

  • /setup.json
    This JSON file describes the topology of the datacenter and can be found in
    this archive at askalon_workload_ee/setup.json.
  • /askalon_workload_ee.gwf
    This file contains the trace for the Askalon workload. This file can be found
    in the archive at askalon_workload_ee/trace/askalon_workload_ee.gwf.
  • /chronos_exp_noscaler_ca.gwf
    This file contains the trace for the Chronos workload. This file can be found
    in the archive at chronos_exp_noscaler_ca/trace/chronos_exp_noscaler_ca.gwf.
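
A quick way to check that all three files are present (ls reports an error
for any file that is missing):

$ ls setup.json askalon_workload_ee.gwf chronos_exp_noscaler_ca.gwf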

Then, you can start the Askalon experiments as follows:

$ docker run -it --rm -v $(pwd):/home/gradle/simulator/data atlargeresearch/sc18-experiment-runner -r 32 -w 4 -s data/setup.json data/askalon_workload_ee.gwf

The experiment runner can be configured with the following options (a combined
example follows the list):

  • -r, --repeat
    The number of times to repeat the experiment for each scheduler.
  • -w, --warm-up
    The number of warm-up runs of the simulator for each scheduler.
  • -p, --parallelism
    The number of experiments to run in parallel.
  • --schedulers
    The list of schedulers to test, separated by spaces. The following schedulers
    are available: SRTF-BESTFIT, SRTF-FIRSTFIT, SRTF-WORSTFIT,
    FIFO-BESTFIT, FIFO-FIRSTFIT, FIFO-WORSTFIT, RANDOM-BESTFIT,
    RANDOM-FIRSTFIT, RANDOM-WORSTFIT.
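
As an illustration of how these options compose, the following sketch runs a
smaller sweep of the Chronos trace with two schedulers and two experiments in
parallel. This assumes the flags combine as listed above and that the trace
argument can follow the scheduler list; adjust as needed:

$ docker run -it --rm -v $(pwd):/home/gradle/simulator/data atlargeresearch/sc18-experiment-runner -r 8 -w 2 -p 2 --schedulers SRTF-BESTFIT FIFO-FIRSTFIT -s data/setup.json data/chronos_exp_noscaler_ca.gwf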

After the Askalon experiments have finished, you can start the Chronos
experiments. Make sure you first copy the result files to another directory,
as they will be overwritten.
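
A minimal sketch for saving the Askalon results, assuming the runner writes
the result CSV files listed earlier directly into the mounted working
directory:

$ mkdir -p results-askalon
$ cp *.csv results-askalon/

Then start the Chronos experiments: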

$ docker run -it --rm -v $(pwd):/home/gradle/simulator/data atlargeresearch/sc18-experiment-runner -r 32 -w 4 -s data/setup.json data/chronos_exp_noscaler_ca.gwf