Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better HTTP Error Handling #199

Merged
merged 2 commits into from
Dec 31, 2024
Merged

Better HTTP Error Handling #199

merged 2 commits into from
Dec 31, 2024

Conversation

ludomitch
Copy link
Contributor

Currently, when we run a rollout and the environment fails, the reason for failure is not included in the traceback. Instead we get the following unhelpful traceback

Traceback (most recent call last):
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 42, in reraise_exc_as
    yield
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 229, in _rollout
    obs, tools = await env.reset()
                 ^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 102, in reset
    return await super().reset()
           ^^^^^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 46, in reset
    response = await self._post(
               ^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 42, in _post
    response.raise_for_status()
  File "aviary-internal/.venv/lib/python3.12/site-packages/httpx/_models.py", line 763, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
Traceback (most recent call last):               
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 42, in reraise_exc_as
    yield
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 229, in _rollout
    obs, tools = await env.reset()
                 ^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 102, in reset
    return await super().reset()
           ^^^^^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 46, in reset
    response = await self._post(
               ^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 42, in _post
    response.raise_for_status()
  File "aviary-internal/.venv/lib/python3.12/site-packages/httpx/_models.py", line 763, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 228, in _rollout
    with reraise_exc_as(EnvError, enabled=self.catch_env_failures):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 46, in reraise_exc_as
    raise reraise(e) from e
ldp.alg.rollout.EnvError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

Traceback (most recent call last):
  File "aviary-internal/.venv/bin/run_expt", line 8, in <module>
    sys.exit(_run_expt())
             ^^^^^^^^^^^
  File "aviary-internal/aviary_internal/utils/configurable.py", line 92, in _run_expt
    asyncio.run(expt.run())
  File ".local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File ".local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "aviary-internal/envs/data-analysis/data_analysis/expts/eval.py", line 112, in run
    await evaluator.run()
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/runners.py", line 143, in run
    await _run_eval_loop(
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/runners.py", line 55, in _run_eval_loop
    trajectories = await rollout_manager.sample_trajectories(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 125, in sample_trajectories
    return await self._sample_trajectories_from_envs(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 196, in _sample_trajectories_from_envs
    await asyncio.gather(
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 261, in _rollout
    await store_step(
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 213, in store_step
    await asyncio.gather(*[
  File "aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/callbacks.py", line 170, in after_transition
    assert traj_id in self.out_files

Proposed changes
I've updated the error handling for http responses to include the reason for env failure at the top of the traceback:

2024-12-30 15:05:55,479 - ldp.alg.rollout - ERROR - Caught env exception:
Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
Status code: 500
Server Traceback:
    Traceback (most recent call last):
      File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/dataset_server.py", line 216, in handle_exc_as_http_exc
        yield
      File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/dataset_server.py", line 123, in reset
        obs, tools = await env.reset()
                     ^^^^^^^^^^^^^^^^^
      File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/envs/data-analysis/data_analysis/env.py", line 72, in reset
        fake = error
               ^^^
    NameError: name 'error' is not defined
    
Traceback (most recent call last):               
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 243, in _rollout
    with reraise_exc_as(EnvError, enabled=self.catch_env_failures):
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 61, in reraise_exc_as
    raise reraise(e) from None
ldp.alg.rollout.EnvError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/bin/run_expt", line 8, in <module>
    sys.exit(_run_expt())
             ^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/aviary_internal/utils/configurable.py", line 92, in _run_expt
    asyncio.run(expt.run())
  File "/Users/ludovicomitchener/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 194, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/.local/share/uv/python/cpython-3.12.7-macos-aarch64-none/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/envs/data-analysis/data_analysis/expts/eval.py", line 112, in run
    await evaluator.run()
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/runners.py", line 143, in run
    await _run_eval_loop(
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/runners.py", line 55, in _run_eval_loop
    trajectories = await rollout_manager.sample_trajectories(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 140, in sample_trajectories
    return await self._sample_trajectories_from_envs(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 211, in _sample_trajectories_from_envs
    await asyncio.gather(
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 283, in _rollout
    await store_step(
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 228, in store_step
    await asyncio.gather(*[
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/callbacks.py", line 170, in after_transition
    assert traj_id in self.out_files

Copy link
Collaborator

@jamesbraza jamesbraza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -42,8 +42,21 @@ def reraise_exc_as(reraise: type[CaughtError], enabled: bool) -> Iterator[None]:
yield
except Exception as e:
if enabled:
logger.exception(f"Caught {reraise.exc_type} exception.")
raise reraise(e) from e
# Add more detailed error information
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, and I like how you set it up with duck typing to support multiple possible HTTP error types. The one tough part of duck typed code is making changes are a bit harder, so it would be good to unit test this.

Do you mind

  1. Moving this to a free function in ldp/utils.py
  2. Making a brief unit test of it, with the particular error type we support

Also, feel free to ignore this comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valid comments, thanks – updated

Comment on lines +99 to +105
error_details = f"{error!s}"

if hasattr(error, "response"):
error_details += f"\nStatus code: {error.response.status_code}"
try:
response_data = error.response.json()
if "detail" in response_data:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
error_details = f"{error!s}"
if hasattr(error, "response"):
error_details += f"\nStatus code: {error.response.status_code}"
try:
response_data = error.response.json()
if "detail" in response_data:
error_details = [f"{error!s}"]
error_details.append(f"\nStatus code: {error.response.status_code}")
error_details.extend((
"\nServer Traceback:\n ",
"\n ".join(response_data["detail"].split("\n")),
"\n",
))
error_details.append(f"\nResponse body: {error.response.text}")
return "".join(error_details)

Copy link

codeflash-ai bot commented Dec 30, 2024

⚡️ Codeflash found optimizations for this PR

📄 38% (0.38x) speedup for format_error_details in ldp/utils.py

⏱️ Runtime : 902 microseconds 656 microseconds (best of 143 runs)

📝 Explanation and details

Here's an optimized version of the original Python program.

made.

  1. Using a list to accumulate parts of the error string and joining them at the end. String concatenation in a loop using += is less efficient than this method.
  2. Simplified the traceback processing by using "\n ".join() for the split lines to reduce repeated string concatenation.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 3 Passed
🌀 Generated Regression Tests 21 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 1 Passed
📊 Tests Coverage undefined
⚙️ Existing Unit Tests Details
- codeflash_concolic_jtjebwfs/tmps1heefup/test_concolic_coverage.py
- test_utils.py
🌀 Generated Regression Tests Details
import pytest  # used for our unit tests
from ldp.utils import format_error_details


# Mocking HTTP responses for testing
class MockResponse:
    def __init__(self, status_code=200, json_data=None, text=""):
        self.status_code = status_code
        self._json_data = json_data
        self.text = text

    def json(self):
        if self._json_data is not None:
            return self._json_data
        raise ValueError("No JSON data")
from ldp.utils import format_error_details

# unit tests

def test_basic_exception_handling():
    # Standard Exception without Response Attribute
    error = ValueError("A simple error message")
    codeflash_output = format_error_details(error)

    # Exception with Custom Message
    error = Exception("Custom error message with special characters: !@#$%^&*()")
    codeflash_output = format_error_details(error)

def test_http_exception_handling():
    # HTTP Exception with Response Attribute and Status Code
    error = Exception()
    error.response = MockResponse(status_code=404)
    codeflash_output = format_error_details(error)

    # HTTP Exception with Response Attribute, Status Code, and JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=400, json_data={"detail": "Error details\nMore details"})
    expected_output = "Exception\nStatus code: 400\nServer Traceback:\n    Error details\n    More details\n"
    codeflash_output = format_error_details(error)

def test_error_handling_in_json_parsing():
    # HTTP Exception with Non-JSON Response Body
    error = Exception()
    error.response = MockResponse(status_code=502, text="Bad Gateway")
    codeflash_output = format_error_details(error)

    # HTTP Exception with Malformed JSON Response
    error = Exception()
    error.response = MockResponse(status_code=500, text='{"detail": "Incomplete JSON')
    codeflash_output = format_error_details(error)

def test_edge_cases():
    # Exception with Empty Message
    error = ValueError("")
    codeflash_output = format_error_details(error)

    # HTTP Exception with Empty JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=400, json_data={"detail": ""})
    codeflash_output = format_error_details(error)

def test_large_scale_cases():
    # HTTP Exception with Large JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=500, json_data={"detail": "Error details\n" * 1000})
    expected_output = "Exception\nStatus code: 500\nServer Traceback:\n" + "    Error details\n" * 1000
    codeflash_output = format_error_details(error)

def test_rare_unexpected_edge_cases():
    # Exception with Non-Standard Attributes
    class CustomExceptionWithAttributes(Exception):
        def __init__(self, message, code, extra_info):
            super().__init__(message)
            self.code = code
            self.extra_info = extra_info

    error = CustomExceptionWithAttributes("Custom error", 123, "Extra information")
    codeflash_output = format_error_details(error)

    # HTTP Exception with Array in JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=400, json_data={"detail": ["Error 1", "Error 2"]})
    codeflash_output = format_error_details(error)

    # Exception Message with Unicode Characters
    error = ValueError("Ошибка: неверный ввод")
    codeflash_output = format_error_details(error)

    # HTTP Exception with Deeply Nested JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=500, json_data={"detail": {"level1": {"level2": {"level3": {"level4": "Deep error message"}}}}})
    codeflash_output = format_error_details(error)

    # HTTP Exception with Circular Reference in JSON
    circular_data = {}
    circular_data["detail"] = circular_data
    error = Exception()
    error.response = MockResponse(status_code=500, json_data=circular_data)
    codeflash_output = format_error_details(error)

    # HTTP Exception with Mixed Types in JSON Detail
    error = Exception()
    error.response = MockResponse(status_code=400, json_data={"detail": {"message": "Error occurred", "code": 123, "info": ["Detail 1", "Detail 2"], "nested": {"key": "value"}}})
    codeflash_output = format_error_details(error)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from unittest.mock import MagicMock

# imports
import pytest  # used for our unit tests
from ldp.utils import format_error_details

# unit tests

def test_basic_exception():
    # Test standard exception without response attribute
    error = Exception("A basic error occurred")
    codeflash_output = format_error_details(error)

def test_value_error():
    # Test ValueError without response attribute
    error = ValueError("Invalid value provided")
    codeflash_output = format_error_details(error)




def test_empty_exception_message():
    # Test exception with empty message
    error = Exception("")
    codeflash_output = format_error_details(error)






def test_exception_with_custom_attributes():
    # Test exception with additional custom attributes
    error = Exception("Custom error with additional attributes")
    error.code = 1234
    codeflash_output = format_error_details(error)

def test_exception_with_unicode_characters():
    # Test exception with unicode characters in message
    error = Exception("Error with unicode: üñîçødë")
    codeflash_output = format_error_details(error)


def test_exception_with_no_message():
    # Test exception with no message
    error = Exception()
    codeflash_output = format_error_details(error)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

📢 Feedback on this optimization? Discord

raise reraise(e) from e
error_details = format_error_details(e)
logger.exception(f"Caught {reraise.exc_type} exception:\n{error_details}")
raise reraise(e) from None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why from None instead of from e? I might be missing something obvious here

Copy link
Contributor Author

@ludomitch ludomitch Dec 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error chaining was adding redundant information. Have a look at the traceback with chaining below:

With chaining (from e)

2024-12-30 20:41:29,924 - ldp.alg.rollout - ERROR - Caught env exception.
Traceback (most recent call last):
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 43, in reraise_exc_as
    yield
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 230, in _rollout
    obs, tools = await env.reset()
                 ^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 102, in reset
    return await super().reset()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 46, in reset
    response = await self._post(
               ^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 42, in _post
    response.raise_for_status()
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/httpx/_models.py", line 763, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500
Traceback (most recent call last):               
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 43, in reraise_exc_as
    yield
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/ldp/alg/rollout.py", line 230, in _rollout
    obs, tools = await env.reset()
                 ^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 102, in reset
    return await super().reset()
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 46, in reset
    response = await self._post(
               ^^^^^^^^^^^^^^^^^
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/aviary/env_client.py", line 42, in _post
    response.raise_for_status()
  File "/Users/ludovicomitchener/Desktop/repos/aviary-internal/.venv/lib/python3.12/site-packages/httpx/_models.py", line 763, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url 'http://localhost:8080/reset'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/500

It doesn't include the source of the http error and shows the reraise_exc_as traceback twice.

@ludomitch ludomitch merged commit 22529f5 into main Dec 31, 2024
6 checks passed
@ludomitch ludomitch deleted the better-http-error-handling branch December 31, 2024 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants