
[Bug]: SWE-Bench inference - Failed to establish a new connection: [Errno 111] Connection refused #4260

Open
jatinganhotra opened this issue Oct 8, 2024 · 3 comments
Labels: bug

@jatinganhotra

Is there an existing issue for the same bug?

Describe the bug

Hi team,

When I try to run inference for SWE-Bench Lite with more than 1 worker, I get the following error. Inference runs fine with only 1 worker, which is the default value.

./evaluation/swe_bench/scripts/run_infer.sh MODEL_CONFIG with the default CodeActAgent
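
For reference, the worker count can also be passed explicitly. The positional-argument order below (model config, commit, agent, eval limit, max iterations, workers, dataset, split) is my assumption from the evaluation README, so please verify it against the header of run_infer.sh; a hypothetical invocation with 8 workers would look like:

./evaluation/swe_bench/scripts/run_infer.sh MODEL_CONFIG HEAD CodeActAgent 300 30 8 princeton-nlp/SWE-bench_Lite test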

I'm getting the following error

Instance django__django-10914 - 2024-10-07 15:21:14,902 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=34090): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfffde2390>: Failed to establish a new connection: [Errno 111] Connection refused'))
Instance astropy__astropy-12907 - 2024-10-07 15:21:19,293 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=30607): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfffb425d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Instance astropy__astropy-14365 - 2024-10-07 15:21:24,839 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=32191): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfffda1610>: Failed to establish a new connection: [Errno 111] Connection refused'))
Instance astropy__astropy-7746 - 2024-10-07 15:21:25,875 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=37017): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfffd07510>: Failed to establish a new connection: [Errno 111] Conn

Stack trace:

----------[The above error occurred. Retrying... (attempt 3 of 5)]----------

Instance django__django-11001 - 2024-10-07 15:16:53,257 - WARNING - Action, ErrorObservation loop detected
Instance django__django-11001 - 2024-10-07 15:16:53,259 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=38197): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfb94fcd10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Instance django__django-11001 - 2024-10-07 15:16:53,261 - ERROR - Error during action execution: HTTPConnectionPool(host='localhost', port=38197): Max retries exceeded with url: /execute_action (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fdfffcdc610>: Failed to establish a new connection: [Errno 111] Connection refused'))
Instance django__django-11001 - 2024-10-07 15:16:53,261 - ERROR - ----------
Error in instance [django__django-11001]: 'ErrorObservation' object has no attribute 'exit_code'. Stacktrace:
Traceback (most recent call last):
  File "/data/workspace/jatinganhotra/OpenDevin/evaluation/utils/shared.py", line 268, in _process_instance_wrapper
    result = process_instance_func(instance, metadata, use_mp)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/workspace/jatinganhotra/OpenDevin/evaluation/swe_bench/run_infer.py", line 367, in process_instance
    return_val = complete_runtime(runtime, instance)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/workspace/jatinganhotra/OpenDevin/evaluation/swe_bench/run_infer.py", line 287, in complete_runtime
    assert obs.exit_code == 0
           ^^^^^^^^^^^^^
AttributeError: 'ErrorObservation' object has no attribute 'exit_code'

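For context, the assertion fails because the runtime hands back an ErrorObservation, which has no exit_code attribute, once the connection is refused. Below is a minimal sketch of the kind of guard that would surface the underlying error instead of the secondary AttributeError; the two classes are simplified stand-ins for illustration only, not the actual openhands types, and this is not a proposed patch:

from dataclasses import dataclass

# Simplified stand-ins for the observation types named in the traceback
# (hypothetical; the real classes live in the openhands package).
@dataclass
class CmdOutputObservation:
    exit_code: int
    content: str = ''

@dataclass
class ErrorObservation:
    content: str = ''

def assert_succeeded(obs):
    # Check for ErrorObservation before touching exit_code, instead of the
    # bare `assert obs.exit_code == 0` at run_infer.py:287.
    if isinstance(obs, ErrorObservation):
        raise RuntimeError(f'runtime action failed: {obs.content}')
    assert obs.exit_code == 0, f'command exited with {obs.exit_code}'

assert_succeeded(CmdOutputObservation(exit_code=0))  # passes
try:
    assert_succeeded(ErrorObservation(content='Connection refused'))
except RuntimeError as err:
    print(err)  # reports the connection failure instead of an AttributeError

With a guard like that, the log would show the connection failure directly rather than the AttributeError above.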

STDOUT logs from the beginning of the run:

Number of workers not specified, use default 16
Commit hash not specified, use current git commit
Agent not specified, use default CodeActAgent
MAX_ITER not specified, use default 30
USE_INSTANCE_IMAGE not specified, use default true
DATASET not specified, use default princeton-nlp/SWE-bench_Lite
SPLIT not specified, use default test
USE_INSTANCE_IMAGE: true
AGENT: CodeActAgent
AGENT_VERSION: v1.9
MODEL_CONFIG: eval_vllm_vela_mistral_large_2
DATASET: princeton-nlp/SWE-bench_Lite
SPLIT: test
USE_HINT_TEXT: false
EVAL_NOTE: v1.9-no-hint
14:48:49 - openhands:INFO: run_infer.py:93 - Using docker image prefix: docker.io/xingyaoww/
14:48:56 - openhands:INFO: run_infer.py:441 - Loaded dataset princeton-nlp/SWE-bench_Lite with split test
14:48:56 - openhands:INFO: utils.py:258 - Loading llm config from eval_vllm_vela_mistral_large_2
14:48:56 - openhands:INFO: shared.py:165 - Using evaluation output directory: evaluation/evaluation_outputs/outputs/swe-bench-lite/CodeActAgent/mistral-large-instruct-2407_maxiter_30_N_v1.9-no-hint
14:48:56 - openhands:INFO: shared.py:181 - Metadata: {"agent_class": "CodeActAgent", "llm_config": {"model": "openai/mistral-large-instruct-2407", "api_key": "******", "base_url": "BASE_URL", "api_version": null, "embedding_model": "local", "embedding_base_url": null, "embedding_deployment_name": null, "aws_access_key_id": null, "aws_secret_access_key": null, "aws_region_name": null, "openrouter_site_url": "https://docs.all-hands.dev/", "openrouter_app_name": "OpenHands", "num_retries": 8, "retry_multiplier": 2, "retry_min_wait": 15, "retry_max_wait": 120, "timeout": null, "max_message_chars": 10000, "temperature": 0.0, "top_p": 1.0, "custom_llm_provider": null, "max_input_tokens": null, "max_output_tokens": null, "input_cost_per_token": null, "output_cost_per_token": null, "ollama_base_url": null, "drop_params": true, "disable_vision": null, "caching_prompt": true, "log_completions": false}, "max_iterations": 30, "eval_output_dir": "evaluation/evaluation_outputs/outputs/swe-bench-lite/CodeActAgent/mistral-large-instruct-2407_maxiter_30_N_v1.9-no-hint", "start_time": "2024-10-07 14:48:56", "git_commit": "dd228c07e05b6908bc1d15dde8f8025284a9ef47", "dataset": "swe-bench-lite", "data_split": null, "details": {}}
14:48:56 - openhands:INFO: shared.py:199 - Writing evaluation output to evaluation/evaluation_outputs/outputs/swe-bench-lite/CodeActAgent/mistral-large-instruct-2407_maxiter_30_N_v1.9-no-hint/output.jsonl
14:48:56 - openhands:INFO: shared.py:232 - Finished instances: 0, Remaining instances: 300

Current OpenHands version

Commit - dd228c07e05b6908bc1d15dde8f8025284a9ef47

Installation and Configuration

> ./evaluation/swe_bench/scripts/run_infer.sh MODEL_CONFIG
Number of workers not specified, use default 16
Commit hash not specified, use current git commit
Agent not specified, use default CodeActAgent
MAX_ITER not specified, use default 30
USE_INSTANCE_IMAGE not specified, use default true
DATASET not specified, use default princeton-nlp/SWE-bench_Lite
SPLIT not specified, use default test
USE_INSTANCE_IMAGE: true
AGENT: CodeActAgent
AGENT_VERSION: v1.9
MODEL_CONFIG: MODEL_CONFIG
DATASET: princeton-nlp/SWE-bench_Lite
SPLIT: test
USE_HINT_TEXT: false
EVAL_NOTE: v1.9-no-hint


### Model and Agent

_No response_

### Operating System

_No response_

### Reproduction Steps

_No response_

### Logs, Errors, Screenshots, and Additional Context

_No response_
jatinganhotra added the bug label Oct 8, 2024
@xingyaoww
Contributor

Yes - I think that's somewhat expected behavior - Docker acts weirdly when you try to run multiple images at once.

You can consider joining our eval channel #remote-runtime-limited-beta to get access to our new infra for running evals in parallel: https://www.all-hands.dev/blog/evaluation-of-llms-as-coding-agents-on-swe-bench-at-30x-speed

@mamoodi
Collaborator

mamoodi commented Oct 8, 2024

@xingyaoww just to clarify, when you say this is expected behavior, do you mean this will likely not be fixed?
In the README: https://github.com/All-Hands-AI/OpenHands/tree/main/evaluation/swe_bench
It specifically allows you to set the number of workers.

@xingyaoww
Contributor

Yeah, I think so - maybe we should make this clearer in the README there.
