Summary
When evaluating a local model via Hugging Face, I tried to use a separate scorer model, specified by a string, in my scorer. Unfortunately, this led to the scorer model being repeatedly loaded until a CUDA out-of-memory error occurred. This occurs even with `max_connections`, `max_subprocesses`, and `max_tasks` set to 1.
Steps to reproduce
Run `sycophancy` from this PR into inspect_evals, but first manually set `scorer_model` in `sycophancy_scorer()` to a string naming a Hugging Face model. This bug occurs with the following (model, scorer_model) combinations: (Hugging Face, OpenAI), (Hugging Face, Hugging Face), (OpenAI, Hugging Face). It does not occur when `scorer_model` is `None`.
I have tested this with 1B models on:

- An EC2 instance with an A10G with 24 GB VRAM
- My M3 Pro MacBook Pro with 32 GB unified memory
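For reference, the failure pattern can be reproduced in miniature without inspect_ai or any GPU: if the scorer resolves its model from a string inside the per-sample scoring function, every call constructs a fresh instance. A minimal sketch, where `FakeModel`, the load counter, and the model name are hypothetical stand-ins (not inspect_ai APIs):

```python
# Hypothetical stand-in for a framework's get_model(): resolving a
# model string constructs (and "loads") a fresh instance every time.
load_count = 0

class FakeModel:
    def __init__(self, name: str):
        global load_count
        load_count += 1  # each construction simulates a full weight load
        self.name = name

def get_model(name: str) -> FakeModel:
    return FakeModel(name)

def score_sample(sample: str) -> str:
    # Bug pattern: the scorer model is resolved per sample, so every
    # scored sample loads the weights again.
    model = get_model("hf/some-1b-model")
    return f"{model.name} scored {sample}"

for s in ["a", "b", "c"]:
    score_sample(s)

print(load_count)  # three samples -> three loads
```

With real multi-gigabyte weights, each of those loads claims fresh VRAM, which matches the observed CUDA out-of-memory failure.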
Proposed solution
Supplement the `active_model_context_var` (retrieved by `get_model`) with an `active_scorer_model_context_var` so that a single instance of a scorer model can be persisted throughout the evaluation. A `scorer` bool can be passed to `get_model` to access it. I think this strikes a balance between supporting what I assume to be a common usage pattern (a separate scorer model) and still letting users deliberately use different instances of a model.
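A minimal sketch of the proposed approach, using the standard-library `contextvars` module; the `FakeModel` class and the `get_model(name, scorer=...)` signature are illustrative assumptions mirroring the proposal, not existing inspect_ai code:

```python
from contextvars import ContextVar

class FakeModel:
    """Hypothetical stand-in for a loaded model instance."""
    def __init__(self, name: str):
        self.name = name

# Mirrors the proposal: a second context var alongside the active model,
# holding the single persisted scorer-model instance.
active_scorer_model_context_var: ContextVar = ContextVar(
    "active_scorer_model", default=None
)

def get_model(name: str, scorer: bool = False) -> FakeModel:
    if scorer:
        model = active_scorer_model_context_var.get()
        if model is None:
            # Load once on first request, then reuse for the whole eval.
            model = FakeModel(name)
            active_scorer_model_context_var.set(model)
        return model
    # Non-scorer path: unchanged, a fresh instance per request.
    return FakeModel(name)

a = get_model("hf/some-1b-model", scorer=True)
b = get_model("hf/some-1b-model", scorer=True)
print(a is b)  # True: the scorer model is persisted across calls
```

Because the cache lives in a `ContextVar`, it is scoped to the running evaluation context rather than being a process-wide global, so concurrent evaluations would not share a scorer model by accident.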
I think we have this resolved by creating the model in the initialization of the scorer rather than on demand. Closing this issue (feel free to re-open if there are other things we should pursue).
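As a sketch of that resolution (the scorer body and model class below are illustrative stand-ins, not the actual inspect_ai implementation): resolving the model once when the scorer factory runs means every per-sample call closes over the same instance.

```python
loads = 0

class FakeModel:
    """Hypothetical stand-in for a loaded model instance."""
    def __init__(self, name: str):
        global loads
        loads += 1  # simulate one weight load per construction
        self.name = name

def sycophancy_scorer(scorer_model: str = "hf/some-1b-model"):
    # Fix: resolve the model once, at scorer initialization...
    model = FakeModel(scorer_model)

    def score(sample: str) -> str:
        # ...so each score() call reuses the same loaded instance.
        return f"{model.name} scored {sample}"

    return score

score = sycophancy_scorer()
for s in ["a", "b", "c"]:
    score(s)

print(loads)  # one load, regardless of sample count
```

The same effect follows without any framework change: the model is a closure variable created once per scorer, rather than being re-resolved from a string on every sample.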