
Issue with running HEIM #3080

Open · sudhir-mcw opened this issue Oct 22, 2024 · 2 comments
Labels: documentation · HEIM (Text2Image) · user question

Comments

@sudhir-mcw commented on Oct 22, 2024

Hi @teetone,
I am trying out HEIM with the following command (taken from the HEIM documentation) and ran into the issue below:

helm-run --run-entries mscoco:model=huggingface/stable-diffusion-v1-4 --suite my-heim-suite --max-eval-instances 1

HuggingFaceDiffusersClient error: Failed to import diffusers.pipelines.stable_diffusion because of the following error (look up to see its traceback):
'Config' object has no attribute 'define_bool_state'
Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)

File "helm/src/helm/benchmark/window_services/window_service_factory.py", line 17, in get_window_service
model_deployment: Optional[ModelDeployment] = get_model_deployment(model_deployment_name)
File "helm/src/helm/benchmark/model_deployment_registry.py", line 132, in get_model_deployment
raise ValueError(f"Model deployment {name} not found")
ValueError: Model deployment openai/clip-vit-large-patch14 not found

0%| | 0/1 [00:35<?, ?it/s]
} [37.279s]
Traceback (most recent call last):
File "helm/src/helm/benchmark/run.py", line 380, in
main()
File "helm/src/helm/common/hierarchical_logger.py", line 104, in wrapper
return fn(*args, **kwargs)
File "helm/src/helm/benchmark/run.py", line 351, in main
run_benchmarking(
File "helm/src/helm/benchmark/run.py", line 128, in run_benchmarking
runner.run_all(run_specs)
File "helm/src/helm/benchmark/runner.py", line 226, in run_all
raise RunnerError(f"Failed runs: [{failed_runs_str}]")
helm.benchmark.runner.RunnerError: Failed runs: ["mscoco:model=huggingface_stable-diffusion-v1-4"]
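For context, my (unconfirmed) suspicion is that the define_bool_state error above is a jax/flax version mismatch: older flax releases call jax.config.define_bool_state, which newer jax releases removed. I checked the installed versions like this:

# Inspect the installed versions of the suspected packages (flax being my guess):
pip show jax jaxlib flax | grep -E "^(Name|Version)"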

Here is some information on my setup: a conda environment with Python 3.9.20. I installed HEIM from source rather than via pip, because pip was taking quite a long time to resolve the dependencies. Here are the steps I used to install:

cd helm
pip install -r requirements.txt
pip install -e .[all]
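For comparison, the documented pip route would have been (extra name as given in the HEIM docs; I did not end up using this):

pip install "crfm-helm[heim]"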

I checked the community forum and also tried bumping jax to a newer version, but still no luck:

jax==0.4.30
jaxlib==0.4.30
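Concretely, I pinned those versions with:

pip install "jax==0.4.30" "jaxlib==0.4.30"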

Is there any other installation or quick-start documentation for HEIM besides heim.md in the docs?

@sudhir-mcw changed the title from "Issues with running HEIM" to "Issue with running HEIM" on Oct 22, 2024
@yifanmai (Collaborator) commented:

The likely cause is that you have not run install-heim-extras.sh as explained in the HEIM docs; could you try that and see if that fixes things?
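A minimal sketch of that step (run from the root of the helm repository; invocation details assumed, the HEIM docs are authoritative):

# Install the extra HEIM dependencies (script name from the HEIM docs):
bash install-heim-extras.sh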

Sorry that this was not clearly explained in the documentation. I've updated the documentation to make things more clear.

@yifanmai added the documentation, user question, and HEIM (Text2Image) labels on Oct 22, 2024
@sudhir-mcw (Author) commented:

Hi @yifanmai, thanks for the reply.
I tried again after running install-heim-extras.sh, but the process is now interrupted by the following error:

AestheticsMetric() {
    Parallelizing computation on 1 items over 4 threads {
100%|██████████| 1/1 [00:08<00:00, 8.58s/it]
    } [8.579s]
} [8.58s]
CLIPScoreMetric(multilingual=False) {
    Parallelizing computation on 1 items over 4 threads {
0%| | 0/1 [00:00<?, ?it/s]
    } [0.002s]
} [0.002s]
} [14.125s]
} [6m14.466s]
Error when running mscoco:model=huggingface_stable-diffusion-v1-4:
Traceback (most recent call last):
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/runner.py", line 216, in run_all
self.run_one(run_spec)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/runner.py", line 307, in run_one
metric_result: MetricResult = metric.evaluate(
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/metric.py", line 143, in evaluate
results: List[List[Stat]] = parallel_map(
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/common/general.py", line 235, in parallel_map
results = list(tqdm(executor.map(process, items), total=len(items), disable=None))
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/site-packages/tqdm/std.py", line 1181, in iter
for obj in iterable:
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 609, in result_iterator
yield fs.pop().result()
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 439, in result
return self.__get_result()
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
raise self._exception
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/lib/python3.9/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/metric.py", line 77, in process
self.metric.evaluate_generation(
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/metrics/image_generation/clip_score_metrics.py", line 58, in evaluate_generation
prompt = WindowServiceFactory.get_window_service(model, metric_service).truncate_from_right(prompt)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/window_services/window_service_factory.py", line 17, in get_window_service
model_deployment: Optional[ModelDeployment] = get_model_deployment(model_deployment_name)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/model_deployment_registry.py", line 130, in get_model_deployment
raise ValueError(f"Model deployment {name} not found")
ValueError: Model deployment openai/clip-vit-large-patch14 not found

100%|██████████| 1/1 [06:14<00:00, 374.49s/it]
} [6m21.356s]
Traceback (most recent call last):
File "/mnt/gpu-perf-test-storage/sudhir/miniconda3/envs/crfm-helm/bin/helm-run", line 8, in
sys.exit(main())
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/common/hierarchical_logger.py", line 104, in wrapper
return fn(*args, **kwargs)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/run.py", line 350, in main
run_benchmarking(
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/run.py", line 127, in run_benchmarking
runner.run_all(run_specs)
File "/mnt/gpu-perf-test-storage/sudhir/helm/src/helm/benchmark/runner.py", line 225, in run_all
raise RunnerError(f"Failed runs: [{failed_runs_str}]")
helm.benchmark.runner.RunnerError: Failed runs: ["mscoco:model=huggingface_stable-diffusion-v1-4"]

It runs fine through the aesthetics metric but stops at the CLIP score calculation. Is there any configuration I am missing?
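In case it is relevant, one workaround I am considering (untested, and the values below are my guesses) is registering the missing deployment locally; HELM reads additional deployments from prod_env/model_deployments.yaml, and the field names below follow HELM's model deployment config format:

# Untested sketch: register openai/clip-vit-large-patch14 as a local model
# deployment so the window service can resolve it. This creates the file;
# merge by hand if you already have one. max_sequence_length=77 matches
# CLIP's text encoder limit; the client_spec class is my guess.
mkdir -p prod_env
cat > prod_env/model_deployments.yaml <<'EOF'
model_deployments:
  - name: openai/clip-vit-large-patch14
    model_name: openai/clip-vit-large-patch14
    tokenizer_name: openai/clip-vit-large-patch14
    max_sequence_length: 77
    client_spec:
      class_name: "helm.clients.huggingface_client.HuggingFaceClient"
EOF

This may also need a matching entry in prod_env/tokenizer_configs.yaml; I have not verified that.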
