Table of contents:
- Tutorial: running the MLPerf inference benchmark and preparing the submission
- Introduction
- System preparation
- Minimal system requirements
- CM installation
- Pull CM repository with cross-platform MLOps and DevOps scripts
- Optional: update CM and repository to the latest version
- Install system dependencies for your platform
- Use CM to detect or install Python 3.8+
- Install Python virtual environment with above Python
- Customize and run the MLPerf inference benchmark
- Debug the MLPerf benchmark
- Customize MLPerf benchmark
- Prepare submission
- The next steps
- Authors
- Acknowledgments
This tutorial briefly explains how to run a modular version of the MLPerf inference benchmark using the cross-platform automation meta-framework (MLCommons CM, aka CK2) with a simple GUI, and how to prepare your submission.
Please follow this CM tutorial from the Student Cluster Competition for more details.
If you have questions, encounter issues or have feature requests, please submit them here, and feel free to join our open taskforce on automation and reproducibility and the Discord discussions.
- Device: CPU (x86-64 or Arm64) or GPU (Nvidia)
- OS: we have tested CM automations on Ubuntu 20.04, Ubuntu 22.04, Debian 10, Red Hat 9 and MacOS 13
- Disk space:
- test runs: minimal preprocessed datasets require less than ~5 GB
- full runs: depends on the task and dataset; some require 0.3 .. 3 TB
- Python: 3.8+
- All other dependencies (artifacts and tools) will be installed by the CM meta-framework
Follow this guide to install the MLCommons CM framework (CK2) on your system.
After the installation, you should be able to access the CM command line as follows:
$ cm
cm {action} {automation} {artifact(s)} {--flags} @input.yaml @input.json
Pull stable MLCommons CM repository with cross-platform CM scripts for modular ML Systems:
cm pull repo mlcommons@ck
CM pulls all such repositories into the $HOME/CM directory to search for CM automations and artifacts.
You can find the location of a pulled repository as follows:
cm find repo mlcommons@ck
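You can also list all CM repositories registered on your system as a quick sanity check (the exact output format may vary across CM versions):
cm list repo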
You can also pull a stable version of this CM repository using a specific checkout (branch, tag or commit):
cm pull repo mlcommons@ck --checkout=...
You can now use the unified CM CLI/API of reusable and cross-platform CM scripts to detect or install all artifacts (tools, models, datasets, libraries, etc.) required for a given software project (the MLPerf inference benchmark in our case).
Conceptually, these scripts take environment variables and files as input, perform a cross-platform action (detect an artifact, download files, install tools), prepare new environment variables and cache the output if needed.
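As an illustration (assuming your CM version supports the generic --out=json flag and the cm show cache command; both are assumptions about the CM CLI rather than part of this tutorial), you can print the output dictionary of a script to see the environment variables it prepares, and then inspect what was cached:
cm run script "get python" --version_min=3.8 --out=json
cm show cache --tags=get,python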
Note that CM can automatically detect or install all dependencies for a given benchmark and run it on a given platform in just one command, using a simple JSON or YAML description of all the required CM scripts.
However, since the goal of this tutorial is to explain how we modularize MLPerf and other benchmarks, we will show you all the individual CM commands needed to prepare and run the MLPerf inference benchmark. You can reuse these commands in your own projects, thus providing a common interface for research projects.
At the end, we will also show you how to run the MLPerf benchmark in one command from scratch.
Note that if you already have CM and the mlcommons@ck repository installed on your system, you can update them to the latest version at any time and clean the CM cache as follows:
python3 -m pip install cmind -U
cm pull repo mlcommons@ck --checkout=master
cm rm cache -f
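You can then verify which version of CM (the cmind package) is installed, for example:
python3 -m pip show cmind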
We suggest installing the system dependencies required by the MLPerf inference benchmark using CM (this requires sudo access).
For this purpose, we have created a cross-platform CM script that will automatically install such dependencies based on your OS (Ubuntu, Debian, Red Hat, MacOS ...).
In this case, the CM script simply serves as a wrapper with a unified and cross-platform interface to native scripts, which you can find and extend here if some dependencies are missing on your machine; this is a collaborative way to make CM scripts portable and interoperable.
You can run this CM script as follows (note that you may be asked for a sudo password on your platform):
cm run script "get sys-utils-cm" --quiet
If you think that you have all system dependencies installed, you can run this script with the --skip flag:
cm run script "get sys-utils-cm" --skip
Since we use the Python reference implementation of the MLPerf inference benchmark (unoptimized), we need to detect or install Python 3.8+ (an MLPerf requirement).
You can detect or install it using the following CM script:
cm run script "get python" --version_min=3.8
You can then install a Python virtual environment based on this Python using CM:
cm run script "install python-venv" --name=mlperf --version_min=3.8
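If you later want to work inside this virtual environment from a regular shell, the mlcommons@ck repository also provides a script to activate it; the tags below are our assumption, so check the repository (for example via cm find script) for the exact name on your system:
cm run script "activate python-venv" --name=mlperf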
You can change the name of your virtual Python environment using the --name flag.
You can use this online GUI to generate CM commands to customize and run the MLPerf inference benchmark. You can select different implementations, models, data sets, frameworks and parameters and then copy/paste the final commands to your shell to run MLPerf.
Alternatively, you can use your own local GUI to run this benchmark as follows:
cm run script --tags=gui \
--script="app generic mlperf inference" \
--prefix="gnome-terminal --"
You may need to substitute gnome-terminal -- with a command that opens a new shell on your OS.
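For example, on a system with xterm you could use something like this (illustrative; xterm -e is not part of CM, just one possible prefix):
cm run script --tags=gui \
   --script="app generic mlperf inference" \
   --prefix="xterm -e"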
CM will attempt to automatically detect or download and install the default versions of all required ML components.
You can add the --debug flag to the CM command to make CM stop just before running a given MLPerf benchmark and open a shell, letting you run and customize the benchmark manually from the command line while reusing the environment variables and tools prepared by CM.
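For example, you can append this flag to the benchmark command generated by the GUI; purely as an illustration (your generated tags and flags may differ):
cm run script "app generic mlperf inference" --debug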
The community provided a unified CM API for the following implementations of the MLPerf inference benchmark:
- Python reference implementation (CPU and CUDA)
- See the current coverage here and please help us test different combinations of models, frameworks and platforms (i.e. collaborative design space exploration)!
- Universal C++ implementation (CPU and CUDA)
- Check our community projects to extend this and other implementations.
- TFLite C++ implementation (CPU)
- Nvidia's implementation (CPU and CUDA)
We are also working on a lightweight universal script to benchmark the performance of any ML model with the MLPerf loadgen without accuracy.
If you want to add your own implementation or backend, the simplest solution is to create a fork of the MLPerf inference GitHub repo, specify this repo in the above GUI in the fields Git URL for MLPerf inference sources to build LoadGen and Git URL for MLPerf inference sources to run benchmarks, and update the CM meta description of our MLPerf wrapper.
Don't hesitate to get in touch with this taskforce to get free help from the community to add your implementation and prepare the submission.
We have tested out-of-the-box CM automation for the MLPerf inference benchmark across diverse x86-64-based platforms (Intel and AMD) as well as Arm64-based machines from RPi4 to AWS Graviton.
If you want to run on an Nvidia GPU, you should have CUDA installed as a minimum requirement. It can be detected using CM as follows:
cm run script "get cuda"
We also suggest installing cuDNN and TensorRT.
If they are not installed, you can use CM scripts to install them as follows:
cm run script --tags=get,cudnn --tar_file=<PATH_TO_CUDNN_TAR_FILE>
cm run script --tags=get,tensorrt --tar_file=<PATH_TO_TENSORRT_TAR_FILE>
You can install specific versions of various backends using CM as follows (optional):
See this PR, prepared by the open taskforce during the public hackathon, to add Neural Magic's DeepSparse BERT backend for MLPerf to the CM automation.
We currently support the int8 BERT-Large model targeting CPU only; CUDA support may come later.
cm run script "get generic-python-lib _onnxruntime" (--version=...)
cm run script "get generic-python-lib _onnxruntime_gpu" (--version=...)
cm run script "get generic-python-lib _torch" (--version=...)
cm run script "get generic-python-lib _torch_cuda" (--version=...)
cm run script "get generic-python-lib _tensorflow" (--version=...)
cm run script "get tensorflow from-src" (--version=...)
cm run script "get tensorflow from-src _tflite" (--version=...)
cm run script --tags=get,tensorrt (--tar_file=<PATH_TO_DOWNLOADED_TENSORRT_PACKAGE_FILE>)
cm run script "get generic-python-lib _apache-tvm" (--version=...)
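You can check which versions CM detected or installed by inspecting the CM cache; the tags below are an assumption, so adjust them to the backend you installed:
cm show cache --tags=generic-python-lib,_onnxruntime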
Please follow this tutorial to run MLPerf with power measurements using CM.
You can use this online GUI to generate CM commands to run the MLPerf inference benchmark, generate your submission and add your results to a temporary W&B dashboard.
Alternatively, you can use your own local GUI to run this benchmark as follows:
cm run script --tags=gui \
--script="run mlperf inference generate-run-cmds" \
--prefix="gnome-terminal --"
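After you select your configuration, the GUI produces a single CM command that you can run directly from your shell. Purely as an illustration (the model, backend, device and scenario flags below are assumptions; always use the exact command generated by the GUI for your setup), such a command may look like this:
cm run script "run mlperf inference generate-run-cmds _submission" \
   --model=resnet50 --backend=onnxruntime --device=cpu \
   --scenario=Offline --quiet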
You are welcome to join the open MLCommons taskforce on automation and reproducibility to contribute to this project, continue optimizing this benchmark, and prepare an official submission for the MLPerf inference benchmarks with the free help of the community.
See the development roadmap here.
- Grigori Fursin (MLCommons, cTuning foundation, cKnowledge Ltd)
- Arjun Suresh (MLCommons, cTuning foundation)
We thank Hai Ah Nam, Steve Leak, Vijay Janappa Reddi, Tom Jablin, Ramesh N Chukka, Peter Mattson, David Kanter, Pablo Gonzalez Mesa, Thomas Zhu, Thomas Schmid and Gaurav Verma for their suggestions and contributions.