Summary
When evaluating a local model via Hugging Face, I tried to use a separate scorer model, specified by a string, in my scorer. Unfortunately, this led to the scorer model being repeatedly loaded until a CUDA out-of-memory error occurred. This occurs even with `max_connections`, `max_subprocesses`, and `max_tasks` set to 1.
Steps to reproduce
Run `sycophancy` from this PR into inspect_evals, but first manually set `scorer_model` in `sycophancy_scorer()` to a string naming a Hugging Face model. This bug occurs with the following (model, scorer_model) combinations: (Hugging Face, OpenAI), (Hugging Face, Hugging Face), (OpenAI, Hugging Face). It does not occur when `scorer_model` is `None`.
I have tested this with 1B models on:

- An EC2 instance with an A10G with 24 GB VRAM
- My M3 Pro MacBook Pro with 32 GB unified memory
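For reference, the failure pattern can be reproduced in miniature without inspect_ai or any GPU: if the scorer resolves its model from a string inside the per-sample scoring function, every call constructs a fresh instance. A minimal sketch, where `FakeModel`, the load counter, and the model name are hypothetical stand-ins (not inspect_ai APIs):

```python
# Hypothetical stand-in for a framework's get_model(): resolving a
# model string constructs (and "loads") a fresh instance every time.
load_count = 0

class FakeModel:
    def __init__(self, name: str):
        global load_count
        load_count += 1  # each construction simulates a full weight load
        self.name = name

def get_model(name: str) -> FakeModel:
    return FakeModel(name)

def score_sample(sample: str) -> str:
    # Bug pattern: the scorer model is resolved per sample, so every
    # scored sample loads the weights again.
    model = get_model("hf/some-1b-model")
    return f"{model.name} scored {sample}"

for s in ["a", "b", "c"]:
    score_sample(s)

print(load_count)  # three samples -> three loads
```

With real multi-gigabyte weights, each of those loads claims fresh VRAM, which matches the observed CUDA out-of-memory failure.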
Proposed solution
Supplement the `active_model_context_var` (retrieved by `get_model`) with an `active_scorer_model_context_var` so that a single instance of a scorer model can be persisted throughout the evaluation. A `scorer` bool can be passed to `get_model` to access it. I think this strikes a balance between supporting what I assume to be a common usage pattern (a separate scorer model) and still letting users deliberately use different instances of a model.
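A minimal sketch of the proposed approach, using the standard-library `contextvars` module; the `FakeModel` class and the `get_model(name, scorer=...)` signature are illustrative assumptions mirroring the proposal, not existing inspect_ai code:

```python
from contextvars import ContextVar

class FakeModel:
    """Hypothetical stand-in for a loaded model instance."""
    def __init__(self, name: str):
        self.name = name

# Mirrors the proposal: a second context var alongside the active model,
# holding the single persisted scorer-model instance.
active_scorer_model_context_var: ContextVar = ContextVar(
    "active_scorer_model", default=None
)

def get_model(name: str, scorer: bool = False) -> FakeModel:
    if scorer:
        model = active_scorer_model_context_var.get()
        if model is None:
            # Load once on first request, then reuse for the whole eval.
            model = FakeModel(name)
            active_scorer_model_context_var.set(model)
        return model
    # Non-scorer path: unchanged, a fresh instance per request.
    return FakeModel(name)

a = get_model("hf/some-1b-model", scorer=True)
b = get_model("hf/some-1b-model", scorer=True)
print(a is b)  # True: the scorer model is persisted across calls
```

Because the cache lives in a `ContextVar`, it is scoped to the running evaluation context rather than being a process-wide global, so concurrent evaluations would not share a scorer model by accident.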
I think we have this resolved by creating the model in the initialization of the scorer rather than on demand. Closing this issue (feel free to re-open if there are other things we should pursue).
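As a sketch of that resolution (the scorer body and model class below are illustrative stand-ins, not the actual inspect_ai implementation): resolving the model once when the scorer factory runs means every per-sample call closes over the same instance.

```python
loads = 0

class FakeModel:
    """Hypothetical stand-in for a loaded model instance."""
    def __init__(self, name: str):
        global loads
        loads += 1  # simulate one weight load per construction
        self.name = name

def sycophancy_scorer(scorer_model: str = "hf/some-1b-model"):
    # Fix: resolve the model once, at scorer initialization...
    model = FakeModel(scorer_model)

    def score(sample: str) -> str:
        # ...so each score() call reuses the same loaded instance.
        return f"{model.name} scored {sample}"

    return score

score = sycophancy_scorer()
for s in ["a", "b", "c"]:
    score(s)

print(loads)  # one load, regardless of sample count
```

The same effect follows without any framework change: the model is a closure variable created once per scorer, rather than being re-resolved from a string on every sample.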