
Specifying scorer model causes CUDA OOM error #927

Closed
alexdzm opened this issue Dec 2, 2024 · 1 comment

Comments


alexdzm commented Dec 2, 2024

Summary

When evaluating a local model via Hugging Face, I tried to use a separate scorer model, specified as a string inside my scorer. Unfortunately, this led to the scorer model being loaded repeatedly until a CUDA out-of-memory error occurred. This happens even with max_connections, max_subprocesses and max_tasks all set to 1.

Steps to reproduce

Run the sycophancy eval from this PR into inspect_evals, but first manually set scorer_model in sycophancy_scorer() to a string naming a Hugging Face model. The bug occurs with the following (model, scorer_model) combinations: (Hugging Face, OpenAI), (Hugging Face, Hugging Face) and (OpenAI, Hugging Face). It does not occur when scorer_model is None.
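
For context, the failing pattern looks roughly like the sketch below. This is a minimal reconstruction, not the actual PR code, using the inspect_ai scorer API as I understand it; the model name and grading prompt are placeholders.

```python
# Minimal reconstruction of the problematic pattern (not the actual PR code).
# Because get_model() is called inside score(), a local Hugging Face scorer
# model ends up being instantiated again for each sample that gets scored.
from inspect_ai.model import get_model
from inspect_ai.scorer import CORRECT, INCORRECT, Score, Target, accuracy, scorer
from inspect_ai.solver import TaskState


@scorer(metrics=[accuracy()])
def sycophancy_scorer(scorer_model: str | None = "hf/meta-llama/Llama-3.2-1B-Instruct"):
    async def score(state: TaskState, target: Target) -> Score:
        # Resolving the model here runs once per sample.
        model = get_model(scorer_model) if scorer_model else get_model()
        result = await model.generate(
            f"Answer: {state.output.completion}\nTarget: {target.text}\n"
            "Reply CORRECT or INCORRECT."
        )
        verdict = result.completion.strip().upper()
        return Score(value=CORRECT if verdict.startswith("CORRECT") else INCORRECT)

    return score
```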

I have tested this with 1B models on:

  • An EC2 instance with an A10G GPU (24 GB VRAM)
  • An M3 Pro MacBook Pro with 32 GB unified memory

Proposed solution

Supplement active_model_context_var (retrieved by get_model) with an active_scorer_model_context_var so that a single instance of the scorer model can be persisted for the whole evaluation. A scorer boolean argument to get_model could select it. I think this supports what I assume is a common usage pattern (a separate scorer model) without making it hard for users to deliberately use different instances of a model.
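
A rough sketch of the idea (this is not inspect_ai source; the Model class and load_model helper below are trivial stand-ins for the framework's real model type and provider resolution):

```python
# Rough sketch of the proposed mechanism only; not inspect_ai source.
from contextvars import ContextVar


class Model:  # stand-in for the framework's model type
    def __init__(self, name: str) -> None:
        self.name = name


def load_model(name: str) -> Model:  # stand-in for provider resolution
    return Model(name)


active_model_context_var: ContextVar[Model | None] = ContextVar(
    "active_model", default=None
)
active_scorer_model_context_var: ContextVar[Model | None] = ContextVar(
    "active_scorer_model", default=None
)


def get_model(model: str | None = None, scorer: bool = False) -> Model | None:
    if model is not None:
        return load_model(model)  # explicit model names behave as before
    if scorer and (cached := active_scorer_model_context_var.get()) is not None:
        # Reuse the single scorer model persisted for the whole evaluation.
        return cached
    return active_model_context_var.get()
```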

@jjallaire
Collaborator

I think we have this resolved by creating the model in the initialization of the scorer rather than on demand? Closing this issue (feel free to re-open if there are other things we should pursue)
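
A minimal sketch of that pattern (assuming the inspect_ai scorer API; the model handling and grading prompt here are illustrative, not the actual fix): resolve the scorer model once, when the scorer is constructed, and reuse it for every sample.

```python
# Sketch of the resolution: load any separate scorer model a single time at
# scorer construction instead of on demand inside score().
from inspect_ai.model import get_model
from inspect_ai.scorer import CORRECT, INCORRECT, Score, Target, accuracy, scorer
from inspect_ai.solver import TaskState


@scorer(metrics=[accuracy()])
def sycophancy_scorer(scorer_model: str | None = None):
    # Created once here, not per sample.
    grader = get_model(scorer_model) if scorer_model else None

    async def score(state: TaskState, target: Target) -> Score:
        model = grader or get_model()  # fall back to the active eval model
        result = await model.generate(
            f"Answer: {state.output.completion}\nTarget: {target.text}\n"
            "Reply CORRECT or INCORRECT."
        )
        verdict = result.completion.strip().upper()
        return Score(value=CORRECT if verdict.startswith("CORRECT") else INCORRECT)

    return score
```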
