From 00c650fdd6fe1e47ed66885ff27d960eb0129b77 Mon Sep 17 00:00:00 2001
From: Arjun Suresh
Date: Fri, 20 Sep 2024 17:26:29 +0100
Subject: [PATCH] Improve the SCC24 docs

---
 .../text_to_image/reproducibility/scc24.md | 41 ++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

diff --git a/docs/benchmarks/text_to_image/reproducibility/scc24.md b/docs/benchmarks/text_to_image/reproducibility/scc24.md
index b55008bbc..4f66097a6 100644
--- a/docs/benchmarks/text_to_image/reproducibility/scc24.md
+++ b/docs/benchmarks/text_to_image/reproducibility/scc24.md
@@ -3,8 +3,47 @@ hide:
 - toc
 ---
 
-# Text to Image using Stable Diffusion for Student Cluster Competition 2024
+# Text-to-Image with Stable Diffusion for Student Cluster Competition 2024
 
+## Introduction
+
+This guide is designed for the [Student Cluster Competition 2024](https://studentclustercompetition.us/2024/index.html) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Stable Diffusion XL 1.0](https://github.com/mlcommons/inference/tree/master/text_to_image#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy. Since the model performs poorly on CPUs, it is essential to run it on GPUs.
+
+A valid MLPerf inference submission requires two types of runs: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric (higher is better). The official MLPerf inference benchmark for Stable Diffusion XL requires processing a minimum of 5,000 samples in both performance and accuracy modes using the COCO 2014 dataset. For SCC, however, the dataset size has been reduced to 50 samples, so both runs can be completed in approximately 5-10 minutes. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.
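+
+As a rough orientation, a CM-automated benchmark run has the shape sketched below. This is only an illustrative sketch: the tags and flag values shown are assumptions rather than the official SCC24 commands, which are listed per implementation in the Run Commands section below.
+
+```bash
+# Illustrative sketch only - take the exact tags and flags from the Run Commands section below.
+# Assumes the CM (cm4mlops) automation is installed and an Nvidia GPU is available.
+cm run script --tags=run-mlperf,inference \
+   --model=sdxl \
+   --implementation=reference \
+   --framework=pytorch \
+   --category=datacenter \
+   --scenario=Offline \
+   --execution_mode=test \
+   --device=cuda \
+   --quiet
+```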
+
+## Scoring
+
+In the SCC, your first objective will be to run a reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.
+
+Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.
+
+Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.
+
+Additionally, significant bonus points will be awarded if your team enhances an existing implementation, adds support for new hardware (such as Intel GPUs), enables multi-node execution, or improves the performance of the MLPerf SDXL model while maintaining accuracy. To earn these bonus points, all improvements must be made publicly available under the Apache 2.0 license and submitted alongside your results to the SCC committee, so that they benefit the MLPerf community.
+
+
+!!! info
+    Both MLPerf and CM automation are evolving projects.
+    If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).
+
+## Artifacts to submit to the SCC committee
+
+You will need to submit the following files:
+
+* `mlperf_submission_short.tar.gz` - automatically generated file with the validated MLPerf results.
+* `mlperf_submission_short_summary.json` - automatically generated summary of the MLPerf results.
+* `mlperf_submission_short.run` - the CM commands used to run the MLPerf Stable Diffusion XL inference benchmark, saved to this file.
+* `mlperf_submission_short.tstamps` - execution timestamps recorded before and after the CM commands, saved to this file.
+* `mlperf_submission_short.md` - a description of your platform and some highlights of the MLPerf benchmark execution.
+
+
+## SCC interview
+
+You are encouraged to highlight and explain the MLPerf inference throughput obtained on your system
+and to describe any improvements and extensions to this benchmark (such as adding a new hardware backend
+or supporting multi-node execution) that are useful for the community and [MLCommons](https://mlcommons.org).
+
+## Run Commands
 
 === "MLCommons-Python"
     ## MLPerf Reference Implementation in Python