
Merge branch 'main' into optional-formatting
hynek committed Nov 2, 2023
2 parents 1932f9b + 9d626bf commit 15c2105
Showing 14 changed files with 177 additions and 112 deletions.
2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -63,7 +63,7 @@ Or if you prefer to use Git via HTTPS:
$ git clone https://github.com/YOU/structlog.git
```

> **Warning**
> [!WARNING]
> - **Before** you start working on a new pull request, use the "*Sync fork*" button in GitHub's web UI to ensure your fork is up to date.
> - **Always create a new branch off `main` for each new pull request.**
> Yes, you can work on `main` in your fork and submit pull requests.
8 changes: 5 additions & 3 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -13,14 +13,16 @@ But please read our [contribution guide](https://github.com/hynek/structlog/blob
If an item doesn't apply to your pull request, **check it anyway** to make it apparent that there's nothing left to do.
-->

- [ ] Do **not** open pull requests from your `main` branch – **use a separate branch**!
- There's a ton of footguns waiting if you don't heed this warning. You can still go back to your project, create a branch from your main branch, push it, and open the pull request from the new branch.
  - This is not a prerequisite for your pull request to be accepted, but **you have been warned**.
- [ ] Added **tests** for changed code.
- The CI fails with less than 100% coverage.
- [ ] **New APIs** are added to [`api.py`](https://github.com/hynek/structlog/blob/main/tests/typing/api.py).
- [ ] **New APIs** are added to our typing tests in [`api.py`](https://github.com/hynek/structlog/blob/main/tests/typing/api.py).
- [ ] Updated **documentation** for changed code.
- [ ] New functions/classes have to be added to `docs/api.rst` by hand.
- [ ] Changed/added classes/methods/functions have appropriate `versionadded`, `versionchanged`, or `deprecated` [directives](http://www.sphinx-doc.org/en/stable/markup/para.html#directive-versionadded).

The next version is the second number in the current release + 1. The first number represents the current year. So if the current version on PyPI is 23.1.0, the next version is gonna be 23.2.0. If the next version is the first in the new year, it'll be 24.1.0.
- The next version is the second number in the current release + 1. The first number represents the current year. So if the current version on PyPI is 23.1.0, the next version is gonna be 23.2.0. If the next version is the first in the new year, it'll be 24.1.0.
- [ ] Documentation in `.rst` and `.md` files is written using [**semantic newlines**](https://rhodesmill.org/brandon/2012/one-sentence-per-line/).
- [ ] Changes (and possible deprecations) are documented in the [**changelog**](https://github.com/hynek/structlog/blob/main/CHANGELOG.md).
- [ ] Consider granting [push permissions to the PR branch](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork), so maintainers can fix minor issues themselves without pestering you.
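
The versioning rule in the checklist above can be sketched as a small helper (the function name and signature are illustrative, not part of the project):

```python
import datetime

def next_version(current: str, today: datetime.date) -> str:
    # CalVer rule from the checklist: YY.MINOR.0, where MINOR resets
    # to 1 at the start of each new year.
    year, minor, _ = (int(part) for part in current.split("."))
    this_year = today.year % 100
    if year == this_year:
        return f"{year}.{minor + 1}.0"
    return f"{this_year}.1.0"

print(next_version("23.1.0", datetime.date(2023, 11, 2)))  # 23.2.0
print(next_version("23.2.0", datetime.date(2024, 1, 15)))  # 24.1.0
```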
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -8,12 +8,12 @@ default_language_version:

repos:
- repo: https://github.com/psf/black
rev: 23.9.1
rev: 23.10.1
hooks:
- id: black

- repo: https://github.com/astral-sh/ruff-pre-commit
rev: v0.0.292
rev: v0.1.3
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
2 changes: 1 addition & 1 deletion .python-version-default
@@ -1 +1 @@
3.11
3.12
8 changes: 7 additions & 1 deletion CHANGELOG.md
@@ -15,12 +15,18 @@ You can find our backwards-compatibility policy [here](https://github.com/hynek/

## [Unreleased](https://github.com/hynek/structlog/compare/23.2.0...HEAD)

### Added

- Async log methods (those starting with an `a`) now also support the collection of callsite information using `structlog.processors.CallsiteParameterAdder`.
[#565](https://github.com/hynek/structlog/issues/565)


### Fixed

- The return value from `get_logger()` (a `BoundLoggerLazyProxy`) now passes `isinstance`-checks against `structlog.typing.BindableLogger` on Python 3.12.
[#561](https://github.com/hynek/structlog/issues/561)

- `structlog.threadlocal.tmp_bind()` now also works `BoundLoggerLazyProxy` (in other words: before anything is bound to a bound logger).
- `structlog.threadlocal.tmp_bind()` now also works with `BoundLoggerLazyProxy` (in other words: before anything is bound to a bound logger).


## [23.2.0](https://github.com/hynek/structlog/compare/23.1.0...23.2.0) - 2023-10-09
93 changes: 49 additions & 44 deletions docs/frameworks.md
@@ -2,34 +2,34 @@

To have consistent log output, it makes sense to configure *structlog* *before* any logging is done.
The best place to perform your configuration varies with applications and frameworks.
If you use standard library's logging, it makes sense to configure them next to each other.
If you use the standard library's {mod}`logging`, it makes sense to configure them next to each other.


## OpenTelemetry
## Celery

[Celery](https://docs.celeryq.dev/)'s multi-process architecture leads unavoidably to race conditions that show up as interleaved logs.
It ships standard library-based helpers in the form of [`celery.utils.log.get_task_logger()`](https://docs.celeryq.dev/en/stable/userguide/tasks.html#logging) that you should use inside of tasks to prevent that problem.

The [Python *OpenTelemetry* SDK](https://opentelemetry.io/docs/instrumentation/python/) offers an easy API to get the current span, so you can enrich your logs with a straight-forward processor:
The most straight-forward way to integrate that with *structlog* is using {doc}`standard-library` and wrapping that logger using {func}`structlog.wrap_logger`:

```python
from opentelemetry import trace
from celery.utils.log import get_task_logger

def add_open_telemetry_spans(_, __, event_dict):
span = trace.get_current_span()
if not span.is_recording():
event_dict["span"] = None
return event_dict
logger = structlog.wrap_logger(get_task_logger(__name__))
```

ctx = span.get_span_context()
parent = getattr(span, "parent", None)
If you want to automatically bind task metadata to your {doc}`contextvars`, you can use [Celery's signals](https://docs.celeryq.dev/en/stable/userguide/signals.html):

event_dict["span"] = {
"span_id": hex(ctx.span_id),
"trace_id": hex(ctx.trace_id),
"parent_span_id": None if not parent else hex(parent.span_id),
}
```python
from celery import signals

return event_dict
@signals.task_prerun.connect
def on_task_prerun(sender, task_id, task, args, kwargs, **_):
structlog.contextvars.bind_contextvars(task_id=task_id, task_name=task.name)
```

See [this issue](https://github.com/hynek/structlog/issues/287) for more details.


## Django

@@ -67,11 +67,42 @@ request_started.connect(bind_request_details, app)
```


## Litestar

[Litestar](https://docs.litestar.dev/) comes with *structlog* support [out of the box](https://docs.litestar.dev/latest/usage/logging.html).


## OpenTelemetry

The [Python OpenTelemetry SDK](https://opentelemetry.io/docs/instrumentation/python/) offers an easy API to get the current span, so you can enrich your logs with a straight-forward processor:

```python
from opentelemetry import trace

def add_open_telemetry_spans(_, __, event_dict):
span = trace.get_current_span()
if not span.is_recording():
event_dict["span"] = None
return event_dict

ctx = span.get_span_context()
parent = getattr(span, "parent", None)

event_dict["span"] = {
"span_id": hex(ctx.span_id),
"trace_id": hex(ctx.trace_id),
"parent_span_id": None if not parent else hex(parent.span_id),
}

return event_dict
```
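
Since a processor is just a callable taking `(logger, method_name, event_dict)` and returning the (possibly enriched) `event_dict`, you can unit-test `add_open_telemetry_spans` — or any processor — by calling it directly, no configuration needed. A minimal sketch with a made-up processor:

```python
def add_app_info(_, __, event_dict):
    # processors receive (logger, method_name, event_dict) and must
    # return the event_dict (or a replacement) for the next processor;
    # the "app"/"demo" key and value are illustrative only
    event_dict["app"] = "demo"
    return event_dict

# calling it directly, as structlog's processor chain would:
out = add_app_info(None, "info", {"event": "hello"})
print(out)  # {'event': 'hello', 'app': 'demo'}
```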


## Pyramid

Configure it in the [application constructor](https://docs.pylonsproject.org/projects/pyramid/en/latest/narr/startup.html#the-startup-process).

Here's an example for a *Pyramid* [*Tween*](https://kapeli.com/dash_share?docset_file=pyramid&docset_name=pyramid&path=narr/hooks.html%23registering-tweens&platform=pyramid) that stores various request-specific data into [*context variables*](contextvars.md):
Here's an example for a Pyramid [*tween*](https://docs.pylonsproject.org/projects/pyramid/en/latest/glossary.html#term-tween) that stores various request-specific data into [*context variables*](contextvars.md):

```python
@dataclass
@@ -92,32 +123,6 @@ class StructLogTween:
```


## Celery

[*Celery*](https://docs.celeryq.dev/)'s multi-process architecture leads unavoidably to race conditions that show up as interleaved logs.
It ships standard library-based helpers in the form of [`celery.utils.log.get_task_logger()`](https://docs.celeryq.dev/en/stable/userguide/tasks.html#logging) that you should use inside of tasks to prevent that problem.

The most straight-forward way to integrate that with *structlog* is using {doc}`standard-library` and wrapping that logger using {func}`structlog.wrap_logger`:

```python
from celery.utils.log import get_task_logger

logger = structlog.wrap_logger(get_task_logger(__name__))
```

If you want to automatically bind task metadata to your {doc}`contextvars`, you can use [*Celery*'s signals](https://docs.celeryq.dev/en/stable/userguide/signals.html):

```python
from celery import signals

@signals.task_prerun.connect
def on_task_prerun(sender, task_id, task, args, kwargs, **_):
structlog.contextvars.bind_contextvars(task_id=task_id, task_name=task.name)
```

See [this issue](https://github.com/hynek/structlog/issues/287) for more details.


## Twisted

The [plugin definition](https://docs.twisted.org/en/stable/core/howto/plugin.html) is the best place.
22 changes: 11 additions & 11 deletions docs/performance.md
@@ -30,15 +30,15 @@ Here are a few hints how to get the best performance out of *structlog* in produ

1. Later calls of {func}`~structlog.configure` don't have any effect on already cached loggers -- that shouldn't matter outside of {doc}`testing <testing>` though.
2. The resulting bound logger is not pickleable.
Therefore, you can't set this option if you e.g. plan on passing loggers around using `multiprocessing`.
Therefore, you can't set this option if you, for example, plan on passing loggers around using {mod}`multiprocessing`.

- Avoid sending your log entries through the standard library if you can: its dynamic nature and flexibility make it a major bottleneck.
Instead use {class}`structlog.WriteLoggerFactory` or -- if your serializer returns bytes (e.g. [*orjson*]) -- {class}`structlog.BytesLoggerFactory`.
Instead use {class}`structlog.WriteLoggerFactory` or -- if your serializer returns bytes (for example, [*orjson*] or [*msgspec*]) -- {class}`structlog.BytesLoggerFactory`.

You can still configure `logging` for packages that you don't control, but avoid it for your *own* log entries.

- Use a faster JSON serializer than the standard library.
Possible alternatives are among others are [*orjson*] or [*RapidJSON*].
- Configure {class}`~structlog.processors.JSONRenderer` to use a faster JSON serializer than the standard library.
  Possible alternatives are, among others, [*orjson*], [*msgspec*], or [RapidJSON](https://pypi.org/project/python-rapidjson/).

- Be conscious about whether and how you use *structlog*'s *asyncio* support.
While it's true that moving log processing into separate threads prevents your application from hanging, it also comes with a performance cost.
@@ -49,10 +49,11 @@ Here are a few hints how to get the best performance out of *structlog* in produ

## Example

Here's an example for a production-ready non-*asyncio* *structlog* configuration that's as fast as it gets:
Here's an example for a production-ready *structlog* configuration that's as fast as it gets:

```python
import logging
import orjson
import structlog

structlog.configure(
@@ -74,24 +75,23 @@ It has the following properties:
- Caches all loggers on first use.
- Filters all log entries below the `info` log level **very** efficiently.
The `debug` method literally consists of `return None`.
- Supports {doc}`contextvars` (thread-local contexts).
- Supports {doc}`contextvars` (thread-local contexts outside of *asyncio*).
- Adds the log level name.
- Renders exceptions.
- Renders exceptions into the `exception` key.
- Adds an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) timestamp under the `timestamp` key in the UTC timezone.
- Renders the log entries as JSON using [*orjson*] which is faster than plain logging in `logging`.
- Renders the log entries as JSON using [*orjson*], which is faster than *plain* logging in {mod}`logging`.
- Uses {class}`structlog.BytesLoggerFactory` because *orjson* returns bytes.
That saves encoding ping-pong.

Therefore a log entry might look like this:

```json
{"event":"hello","timestamp":"2020-11-17T09:54:11.900066Z"}
{"event":"hello","level":"info","timestamp":"2023-11-02T08:03:38.298565Z"}
```

---

If you need standard library support for external projects, you can either just use a JSON formatter like [*python-json-logger*](https://pypi.org/project/python-json-logger/), or pipe them through *structlog* as documented in {doc}`standard-library`.

[*orjson*]: https://github.com/ijl/orjson
[*rapidjson*]: https://pypi.org/project/python-rapidjson/
[*simplejson*]: https://simplejson.readthedocs.io/
[*msgspec*]: https://jcristharif.com/msgspec/
3 changes: 2 additions & 1 deletion src/structlog/_frames.py
@@ -11,6 +11,7 @@
from io import StringIO
from types import FrameType

from .contextvars import _ASYNC_CALLING_STACK
from .typing import ExcInfo


@@ -50,7 +51,7 @@ def _find_first_app_frame_and_name(
tuple of (frame, name)
"""
ignores = ["structlog"] + (additional_ignores or [])
f = sys._getframe()
f = _ASYNC_CALLING_STACK.get(sys._getframe())
name = f.f_globals.get("__name__") or "?"
while any(tuple(name.startswith(i) for i in ignores)):
if f.f_back is None:
58 changes: 44 additions & 14 deletions src/structlog/_log_levels.py
@@ -17,6 +17,7 @@
from typing import Any, Callable

from ._base import BoundLoggerBase
from .contextvars import _ASYNC_CALLING_STACK
from .typing import EventDict, FilteringBoundLogger


@@ -91,16 +92,26 @@ def exception(
async def aexception(
self: FilteringBoundLogger, event: str, *args: Any, **kw: Any
) -> Any:
"""
.. versionchanged:: 23.3.0
Callsite parameters are now also collected under asyncio.
"""
# Exception info has to be extracted this early, because it is no longer
# available once control is passed to the executor.
if kw.get("exc_info", True) is True:
kw["exc_info"] = sys.exc_info()

scs_token = _ASYNC_CALLING_STACK.set(sys._getframe().f_back) # type: ignore[arg-type]
ctx = contextvars.copy_context()
return await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(lambda: self.error(event, *args, **kw)),
)
try:
runner = await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(lambda: self.error(event, *args, **kw)),
)
finally:
_ASYNC_CALLING_STACK.reset(scs_token)

return runner


def make_filtering_bound_logger(min_level: int) -> type[FilteringBoundLogger]:
@@ -170,16 +181,24 @@ def meth(self: Any, event: str, *args: Any, **kw: Any) -> Any:
return self._proxy_to_logger(name, event % args, **kw)

async def ameth(self: Any, event: str, *args: Any, **kw: Any) -> Any:
"""
.. versionchanged:: 23.3.0
Callsite parameters are now also collected under asyncio.
"""
if args:
event = event % args

scs_token = _ASYNC_CALLING_STACK.set(sys._getframe().f_back) # type: ignore[arg-type]
ctx = contextvars.copy_context()
await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(
lambda: self._proxy_to_logger(name, event, **kw)
),
)
try:
await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(
lambda: self._proxy_to_logger(name, event, **kw)
),
)
finally:
_ASYNC_CALLING_STACK.reset(scs_token)

meth.__name__ = name
ameth.__name__ = f"a{name}"
@@ -199,17 +218,28 @@ def log(self: Any, level: int, event: str, *args: Any, **kw: Any) -> Any:
async def alog(
self: Any, level: int, event: str, *args: Any, **kw: Any
) -> Any:
"""
.. versionchanged:: 23.3.0
Callsite parameters are now also collected under asyncio.
"""
if level < min_level:
return None
name = _LEVEL_TO_NAME[level]
if args:
event = event % args

scs_token = _ASYNC_CALLING_STACK.set(sys._getframe().f_back) # type: ignore[arg-type]
ctx = contextvars.copy_context()
return await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(lambda: self._proxy_to_logger(name, event, **kw)),
)
try:
runner = await asyncio.get_running_loop().run_in_executor(
None,
lambda: ctx.run(
lambda: self._proxy_to_logger(name, event, **kw)
),
)
finally:
_ASYNC_CALLING_STACK.reset(scs_token)
return runner

meths: dict[str, Callable[..., Any]] = {"log": log, "alog": alog}
for lvl, name in _LEVEL_TO_NAME.items():
