Commit 00c650f: Improve the SCC24 docs
arjunsuresh committed Sep 20, 2024 (1 parent: 9009382)
Showing 1 changed file with 40 additions and 1 deletion: docs/benchmarks/text_to_image/reproducibility/scc24.md
@@ -3,8 +3,47 @@ hide:
- toc
---

# Text-to-Image with Stable Diffusion for Student Cluster Competition 2024

## Introduction

This guide is designed for the [Student Cluster Competition 2024](https://studentclustercompetition.us/2024/index.html) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Stable Diffusion XL 1.0](https://github.com/mlcommons/inference/tree/master/text_to_image#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy. Since the model performs poorly on CPUs, it is essential to run it on GPUs.

For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric—higher values are better. The official MLPerf inference benchmark for Stable Diffusion XL requires processing a minimum of 5,000 samples in both performance and accuracy modes using the COCO 2014 dataset. However, for SCC, the dataset size has been reduced to 50 samples, making it possible to complete both runs in approximately 5-10 minutes. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.
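
For orientation, here is a minimal sketch of what launching a short baseline run with the CM automation can look like. The variation tags (`_short`, `_scc24-base`) and flag values below are assumptions for illustration only; use the exact commands from the "Run Commands" section for your implementation and hardware.

```bash
# Illustrative sketch only: the tags and flags below are assumptions.
# Follow the "Run Commands" tabs for the exact invocation for your setup.
cm run script --tags=run-mlperf,inference,_short,_scc24-base \
   --model=sdxl \
   --implementation=reference \
   --framework=pytorch \
   --device=cuda \
   --scenario=Offline \
   --execution_mode=test \
   --quiet
```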

## Scoring

In the SCC, your first objective will be to run a reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.

Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.
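
As one concrete illustration of such tuning, the sketch below reruns the benchmark with an explicit batch size on an Nvidia GPU. The flag names, tags, and values are assumptions; which knobs are actually available (batch size, precision, framework) depends on the implementation you choose.

```bash
# Hypothetical tuning pass: the batch size and implementation choice are
# examples, not a recommended configuration; verify the options your setup supports.
cm run script --tags=run-mlperf,inference,_short,_scc24-main \
   --model=sdxl \
   --implementation=nvidia \
   --device=cuda \
   --scenario=Offline \
   --batch_size=8 \
   --quiet
```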

Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.

Additionally, significant bonus points will be awarded if your team enhances an existing implementation, adds support for new hardware (such as Intel GPUs), enables multi-node execution, or improves the MLPerf SDXL model's performance while maintaining accuracy. All improvements must be made publicly available under the Apache 2.0 license and submitted alongside your results to the SCC committee to earn these bonus points, contributing to the MLPerf community.


!!! info
    Both MLPerf and CM automation are evolving projects.
    If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).

## Artifacts to submit to the SCC committee

You will need to submit the following files:

* `mlperf_submission_short.tar.gz` - automatically generated file with validated MLPerf results.
* `mlperf_submission_short_summary.json` - automatically generated summary of MLPerf results.
* `mlperf_submission_short.run` - CM commands used to run the MLPerf Stable Diffusion XL inference benchmark, saved to this file.
* `mlperf_submission_short.tstamps` - execution timestamps captured before and after the CM command, saved to this file (see the sketch after this list for one way to capture these two files).
* `mlperf_submission_short.md` - description of your platform and some highlights of the MLPerf benchmark execution.
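
A hypothetical way to capture the `.run` and `.tstamps` artifacts is sketched below; the helper variable and timestamp format are assumptions, so adapt them to the actual CM command you execute.

```bash
# Hypothetical helper: records the CM command and the surrounding timestamps.
# Replace RUN_CMD with the exact command you use for your submission run.
RUN_CMD='cm run script --tags=run-mlperf,inference,_short,_scc24-base --model=sdxl --quiet'

echo "$RUN_CMD" > mlperf_submission_short.run                 # the CM command you ran
date --iso-8601=seconds >> mlperf_submission_short.tstamps    # timestamp before the run
eval "$RUN_CMD"
date --iso-8601=seconds >> mlperf_submission_short.tstamps    # timestamp after the run
```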


## SCC interview

You are encouraged to highlight and explain the MLPerf inference throughput obtained on your system
and to describe any improvements and extensions to this benchmark (such as adding a new hardware backend
or supporting multi-node execution) that would be useful for the community and [MLCommons](https://mlcommons.org).

## Run Commands

=== "MLCommons-Python"
## MLPerf Reference Implementation in Python
