sample_concatenate_permutation ineffective in user.conf #2028

Open
psyhtest opened this issue Jan 13, 2025 · 11 comments

@psyhtest (Contributor)

We would like to run a custom model under the same rules as llama2-70b, i.e. to visit each sample of the OpenOrca dataset at least once.

However, when we place the following line in user.conf (custom is the name of the model):

custom.Offline.sample_concatenate_permutation = 1

it has no effect, and mlperf_log_detail.txt confirms that sample_concatenate_permutation is still disabled:

:::MLLOG {"key": "requested_sample_concatenate_permutation", "value": false,
...
:::MLLOG {"key": "effective_sample_concatenate_permutation", "value": false,

Only when we place this line in mlperf.conf and rebuild LoadGen does it take effect, with mlperf_log_detail.txt confirming that sample_concatenate_permutation is now enabled:

:::MLLOG {"key": "requested_sample_concatenate_permutation", "value": true, "time_ms": 0.022396, "namespace": "mlperf::logging", "event_type": "POINT_IN_TIME", "metadata": {"is_error": false, "is_warning": false, "file": "test_settings_internal.cc", "line_no": 339, "pid": 34836, "tid": 34836}}
:::MLLOG {"key": "effective_sample_concatenate_permutation", "value": true, "time_ms": 0.023754, "namespace": "mlperf::logging", "event_type": "POINT_IN_TIME", "metadata": {"is_error": false, "is_warning": false, "file": "test_settings_internal.cc", "line_no": 453, "pid": 34836, "tid": 34836}}

But then we get an error:

:::MLLOG {"key": "error_uncommitted_loadgen_changes", "value": "Loadgen built with uncommitted changes!", "time_ms": 0.006177, "namespace": "mlperf::logging", "event_type": "POINT_IN_TIME", "metadata": {"is_error": true, "is_warning": false, "file": "version.cc", "line_no": 63, "pid": 34836, "tid": 34836}}

Since mlperf.conf is now supposed to be static (submitters should not have to modify it or rebuild LoadGen), we consider this behaviour a bug in LoadGen.
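
For context, here is a minimal sketch of how a harness typically layers the two config files via LoadGen's TestSettings::FromConfig (the classic three-argument signature is assumed; the paths and the model name custom are illustrative). The second call is where the user.conf line above should take effect:

// A minimal sketch of layering mlperf.conf and user.conf in a harness.
#include <iostream>

#include "test_settings.h"  // mlperf::TestSettings, from LoadGen

int main() {
  mlperf::TestSettings settings;

  // mlperf.conf first, then user.conf: later values override earlier ones,
  // except for keys that LoadGen restricts to the first (mlperf.conf) parse.
  if (settings.FromConfig("mlperf.conf", "custom", "Offline") != 0 ||
      settings.FromConfig("user.conf", "custom", "Offline") != 0) {
    std::cerr << "failed to parse config\n";
    return 1;
  }

  // Logged by LoadGen as requested_/effective_sample_concatenate_permutation.
  std::cout << "sample_concatenate_permutation = "
            << settings.sample_concatenate_permutation << "\n";
  return 0;
}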

@arjunsuresh (Contributor) commented Jan 13, 2025

@psyhtest For running custom open models, we can still use the official model name in user.conf, right? I.e.:

llama2-70b.Offline ....

Is there a requirement to use the custom model name? This is something we have never tried before.

@psyhtest (Contributor, Author)

> Is there a requirement to use the custom model name? This is something we have never tried before.

This works for all other parameters we tried, e.g. min_query_count and target_qps:

custom.Offline.min_query_count = 24576
custom.Offline.performance_sample_count_override = 24576
custom.Offline.target_qps = 123
custom.Offline.coalesce_queries = 1

Using a fixed official name for every unofficial model may lead to confusion.
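
For the one setting that does not work this way, a harness can force it programmatically after the configs are parsed. A minimal sketch, assuming the sample_concatenate_permutation field exposed in loadgen/test_settings.h (the helper name is hypothetical):

#include "test_settings.h"  // mlperf::TestSettings, from LoadGen

// Hypothetical helper: applies overrides that user.conf currently cannot.
void ApplyCustomOverrides(mlperf::TestSettings* settings) {
  // Equivalent to custom.Offline.sample_concatenate_permutation = 1,
  // but applied in code, so no LoadGen rebuild is required.
  settings->sample_concatenate_permutation = true;
}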

@arjunsuresh (Contributor) commented Jan 13, 2025

Yes @psyhtest - the performance_sample_count_override option is not expected to be changed by a user and hence is part of only mlperf.conf:

https://github.com/mlcommons/inference/blob/master/loadgen/test_settings_internal.cc#L700

> Using a fixed official name for every unofficial model may lead to confusion.

Why should that cause confusion? For example, all open sparse model submissions from NeuralMagic were done using the official model name bert in the user.conf. If we run user.conf with custom model names, LoadGen won't have any control over the benchmark, right?

@arjunsuresh (Contributor)

Nvidia is using * in the user.conf file to make the config apply to any model.

https://github.com/mlcommons/inference_results_v4.1/blob/main/open/NVIDIA/measurements/Orin_TRT_DepthPruned/llama2-70b-99/Offline/user.conf

NeuralMagic is using the official model name for open models in the user.conf.

https://github.com/mlcommons/inference_results_v4.1/blob/main/open/NeuralMagic/measurements/4xH100-SXM-80GB_vLLM_GPTQ-reference-cpu-pytorch-v2.3.1-default_config/llama2-70b-99/offline/user.conf

If there is a reason to allow sample_concatenate_permutation to be overridden by the user, we can certainly do it. I believe using * or the official name will keep the open results more consistent.
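
For illustration, a wildcard user.conf in the spirit of the NVIDIA submission might look like this (the values are placeholders):

*.Offline.target_qps = 123
*.Offline.min_query_count = 24576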

@psyhtest (Contributor, Author)

It would be rather weird to have to use, for example, llama2-70b instead of llama3.2-1b. It would also not be possible to use the same user.conf file to configure several models, e.g. llama3.2-1b and llama3.2-3b, if one so wishes (mlperf.conf does exactly that). The same problem arises with *.

Then, if we use a generic name for several workloads, how do we distinguish between results for different workloads in the results table?

@psyhtest (Contributor, Author)

Also, what if we want to explore, say, one or more sets of ultra low latency constraints (lower than "low latency")?

@arjunsuresh (Contributor)

> It would be rather weird to have to use, for example, llama2-70b instead of llama3.2-1b. It would also not be possible to use the same user.conf file to configure several models, e.g. llama3.2-1b and llama3.2-3b, if one so wishes (mlperf.conf does exactly that). The same problem arises with *.
>
> Then, if we use a generic name for several workloads, how do we distinguish between results for different workloads in the results table?

The name is only for use in user.conf. In the "models" directory under results, measurements, etc., custom names can be used, as done here.

@arjunsuresh (Contributor)

> Also, what if we want to explore, say, one or more sets of ultra low latency constraints (lower than "low latency")?

Currently you can change the latency constraints by using the user.conf file with the official model name, right?

If you think custom model names must be allowed in the user.conf file, we can discuss that in tomorrow's WG meeting. The problem is that then anyone can override anything, and the submission checker will need to catch all invalid configurations.

@arjunsuresh (Contributor)

Hi @psyhtest, currently only the flags below are restricted to the mlperf.conf file; anything else you should be able to set via user.conf.

  1. qsl_rng_seed
  2. sample_index_rng_seed
  3. schedule_rng_seed
  4. sample_concatenate_permutation
  5. accuracy_log_probability

I have added a PR to allow sample_concatenate_permutation in user.conf, as it is useful for doing a quick test run: #2035

The other parameters that live only in mlperf.conf are fine, right?
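
For reference, the restriction lives inside TestSettings::FromConfig, which is called once per config file. A simplified sketch of the pattern (not the actual source):

#include <string>

// Simplified sketch of the gating used in loadgen/test_settings_internal.cc:
// a static counter distinguishes the first config file (mlperf.conf) from
// later ones (user.conf).
int FromConfigSketch(const std::string& path, const std::string& model,
                     const std::string& scenario) {
  static int configCount = 0;

  if (configCount == 0) {
    // Keys restricted to mlperf.conf are read only on this first call:
    // qsl_rng_seed, sample_index_rng_seed, schedule_rng_seed,
    // sample_concatenate_permutation, accuracy_log_probability.
  }

  // Keys users may override (min_query_count, target_qps, ...) are read on
  // every call, so a later user.conf value wins.

  configCount++;
  return 0;  // 0 indicates success
}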

@psyhtest (Contributor, Author)

Thanks @arjunsuresh! Yes, the random number generator seeds should definitely be read-only for each round. And changing accuracy_log_probability could affect TEST01 results, so it's a good idea to have it read-only too.

@arjunsuresh (Contributor)

Thank you @psyhtest for confirming.
