--retries has no effect during streaming downloads #12383

Open
1 task done
asottile-sentry opened this issue Nov 3, 2023 · 1 comment
Labels
S: needs triage (Issues/PRs that need to be triaged); type: bug (A confirmed bug or unintended behavior)

Comments

@asottile-sentry

Description

a few things up front, since I've found a few related issues, including one that's almost certainly an exact duplicate -- however, the advice there doesn't seem like an exact match for my case:

in my case I am seeing this from github actions both against public pypi and against an "internal" but public-facing wheeling mirror at https://pypi.devinfra.sentry.io/simple . I'm using the default timeout (15 seconds), which is plenty for most connections (typical network speeds between GHA and public pypi or our pypi server are about 50MBps -- easily downloading even the bulkiest packages in a second or two). unfortunately GitHub's network is significantly flaky -- and simple retries would help immensely for these downloads (even if they started over from the beginning as #4796)

the rationale given in the duplicate above was that retrying ReadTimeoutError would lead to excessively long waits -- however, pip already seems to retry it when the response isn't streamed. here's an example where I've artificially set the timeout against public pypi low enough that you can see the retries on ReadTimeoutError:

$ pip install --timeout .01 cfgv
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<pip._vendor.urllib3.connection.HTTPSConnection object at 0x7f3759653190>: Failed to establish a new connection: [Errno 101] Network is unreachable')': /simple/cfgv/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=0.01)")': /simple/cfgv/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=0.01)")': /simple/cfgv/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=0.01)")': /simple/cfgv/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=0.01)")': /simple/cfgv/
ERROR: Could not find a version that satisfies the requirement cfgv (from versions: none)
ERROR: No matching distribution found for cfgv

worst case I can retry the entire pip installation -- but this seems like quite a heavy hammer for what should essentially be a bunch of file downloads


related: it seems --timeout also doesn't affect streamed downloads:

$ time pip download --no-cache-dir --timeout .25 torch --no-deps
Collecting torch
  Obtaining dependency information for torch from https://files.pythonhosted.org/packages/e1/24/f7fe3fe82583e6891cc3fceeb390f192f6c7f1d87e5a99a949ed33c96167/torch-2.1.0-cp38-cp38-manylinux1_x86_64.whl.metadata
  Downloading torch-2.1.0-cp38-cp38-manylinux1_x86_64.whl.metadata (25 kB)
Downloading torch-2.1.0-cp38-cp38-manylinux1_x86_64.whl (670.2 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 670.2/670.2 MB 34.6 MB/s eta 0:00:00
Saved ./torch-2.1.0-cp38-cp38-manylinux1_x86_64.whl
Successfully downloaded torch

real	0m22.157s
user	0m6.681s
sys	0m3.720s
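
One possible reading of the run above, offered as a sketch and not as a statement about pip internals: a socket-level read timeout bounds each individual read, not the whole transfer, so a download that keeps delivering bytes can run far longer than the configured timeout. The standard-library example below illustrates that behavior with a local socket pair; `trickle`, `read_all` and `demo` are hypothetical helper names.

```python
import socket
import threading
import time


def trickle(sock: socket.socket, chunks: int, delay: float) -> None:
    """Send one byte every `delay` seconds, then close the socket."""
    with sock:
        for _ in range(chunks):
            time.sleep(delay)
            sock.sendall(b"x")


def read_all(sock: socket.socket, timeout: float) -> bytes:
    """Read until EOF; `timeout` applies to each recv() call separately."""
    sock.settimeout(timeout)
    received = bytearray()
    while True:
        # Raises socket.timeout only if this single read stalls too long.
        chunk = sock.recv(1024)
        if not chunk:
            return bytes(received)
        received += chunk


def demo(timeout: float = 0.5, chunks: int = 8, delay: float = 0.1):
    """Return (data, elapsed seconds) for a slow but steady sender."""
    reader, writer = socket.socketpair()
    sender = threading.Thread(target=trickle, args=(writer, chunks, delay))
    start = time.monotonic()
    sender.start()
    with reader:
        data = read_all(reader, timeout)
    sender.join()
    return data, time.monotonic() - start
```

With the defaults, the whole transfer takes longer than the 0.5 second timeout, yet no timeout is raised, because no single `recv()` waits that long.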

Expected behavior

I expect a retry to occur while streaming a response (wheel / sdist / archive download) if the connection stalls due to an intermittent network failure and raises ReadTimeoutError
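
A minimal sketch of the requested behavior, assuming a restart-from-the-beginning strategy and using only the standard library. It is not pip's code: `download_with_retries` and `open_stream` are hypothetical names, and `open_stream` stands in for whatever issues a fresh request and yields the body in chunks.

```python
import socket
from typing import Callable, Iterable


def download_with_retries(
    open_stream: Callable[[], Iterable[bytes]],
    dest_path: str,
    attempts: int = 5,
) -> int:
    """Write the stream to dest_path, restarting from scratch on a read timeout.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        try:
            # Mode "wb" truncates the file, discarding any partial download.
            with open(dest_path, "wb") as dest:
                for chunk in open_stream():
                    dest.write(chunk)
            return attempt
        except socket.timeout:
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
    raise ValueError("attempts must be at least 1")
```

Only timeouts raised while reading the body are retried here; any other error propagates immediately, which keeps the retry narrowly scoped to the stalled-connection case described above.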

pip version

22.1.2 -- also reproduced on latest (23.3.1)

Python version

3.8.16

OS

ubuntu 22.04 (ubuntu-latest in GHA)

How to Reproduce

I am using this requirements file -- however I can reproduce it against public pypi with enough attempts (github actions network is quite flaky unfortunately!)

effectively I'm running pip install -r requirements-dev-frozen.txt

Output

going to hide this one because it's a bit long

the relevant error is:

  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 443, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='pypi.devinfra.sentry.io', port=443): Read timed out.

$ pip install -r requirements-dev-frozen.txt
Looking in indexes: https://pypi.devinfra.sentry.io/simple
Collecting aiohttp==3.8.5
  Downloading https://pypi.devinfra.sentry.io/wheels/aiohttp-3.8.5-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 29.5 MB/s eta 0:00:00
Collecting aiosignal==1.3.1
  Downloading https://pypi.devinfra.sentry.io/wheels/aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting amqp==2.6.1
  Downloading https://pypi.devinfra.sentry.io/wheels/amqp-2.6.1-py2.py3-none-any.whl (48 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.0/48.0 KB 14.0 MB/s eta 0:00:00
Collecting asgiref==3.7.2
  Downloading https://pypi.devinfra.sentry.io/wheels/asgiref-3.7.2-py3-none-any.whl (24 kB)
Collecting async-generator==1.10
  Downloading https://pypi.devinfra.sentry.io/wheels/async_generator-1.10-py3-none-any.whl (18 kB)
Collecting async-timeout==4.0.2
  Downloading https://pypi.devinfra.sentry.io/wheels/async_timeout-4.0.2-py3-none-any.whl (5.8 kB)
Collecting attrs==19.2.0
  Downloading https://pypi.devinfra.sentry.io/wheels/attrs-19.2.0-py2.py3-none-any.whl (40 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 40.7/40.7 KB 15.5 MB/s eta 0:00:00
Collecting avalara==20.9.0
  Downloading https://pypi.devinfra.sentry.io/wheels/Avalara-20.9.0-py3-none-any.whl (61 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.0/62.0 KB 16.5 MB/s eta 0:00:00
Collecting beautifulsoup4==4.7.1
  Downloading https://pypi.devinfra.sentry.io/wheels/beautifulsoup4-4.7.1-py3-none-any.whl (94 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 94.3/94.3 KB 36.6 MB/s eta 0:00:00
Collecting billiard==3.6.4.0
  Downloading https://pypi.devinfra.sentry.io/wheels/billiard-3.6.4.0-py3-none-any.whl (89 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.5/89.5 KB 29.1 MB/s eta 0:00:00
Collecting black==22.10.0
  Downloading https://pypi.devinfra.sentry.io/wheels/black-22.10.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 72.5 MB/s eta 0:00:00
Collecting boto3==1.28.26
  Downloading https://pypi.devinfra.sentry.io/wheels/boto3-1.28.26-py3-none-any.whl (135 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 135.8/135.8 KB 44.6 MB/s eta 0:00:00
Collecting botocore==1.31.26
  Downloading https://pypi.devinfra.sentry.io/wheels/botocore-1.31.26-py3-none-any.whl (11.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.1/11.1 MB 50.7 MB/s eta 0:00:00
Collecting brotli==1.0.9
  Downloading https://pypi.devinfra.sentry.io/wheels/Brotli-1.0.9-cp38-cp38-manylinux1_x86_64.whl (357 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 357.2/357.2 KB 7.0 MB/s eta 0:00:00
Collecting build==0.8.0
  Downloading https://pypi.devinfra.sentry.io/wheels/build-0.8.0-py3-none-any.whl (17 kB)
Collecting cachetools==5.3.0
  Downloading https://pypi.devinfra.sentry.io/wheels/cachetools-5.3.0-py3-none-any.whl (9.3 kB)
Collecting celery==4.4.7
  Downloading https://pypi.devinfra.sentry.io/wheels/celery-4.4.7-py2.py3-none-any.whl (427 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 427.6/427.6 KB 90.3 MB/s eta 0:00:00
Collecting certifi==2023.7.22
  Downloading https://pypi.devinfra.sentry.io/wheels/certifi-2023.7.22-py3-none-any.whl (158 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 158.3/158.3 KB 50.7 MB/s eta 0:00:00
Collecting cffi==1.15.1
  Downloading https://pypi.devinfra.sentry.io/wheels/cffi-1.15.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (442 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 442.7/442.7 KB 83.7 MB/s eta 0:00:00
Collecting cfgv==3.3.1
  Downloading https://pypi.devinfra.sentry.io/wheels/cfgv-3.3.1-py2.py3-none-any.whl (7.3 kB)
Collecting charset-normalizer==3.0.1
  Downloading https://pypi.devinfra.sentry.io/wheels/charset_normalizer-3.0.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (195 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 195.4/195.4 KB 50.4 MB/s eta 0:00:00
Collecting click==8.0.4
  Downloading https://pypi.devinfra.sentry.io/wheels/click-8.0.4-py3-none-any.whl (97 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.5/97.5 KB 31.9 MB/s eta 0:00:00
Collecting confluent-kafka==2.1.1
  Downloading https://pypi.devinfra.sentry.io/wheels/confluent_kafka-2.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.9/3.9 MB 79.9 MB/s eta 0:00:00
Collecting covdefaults==2.3.0
  Downloading https://pypi.devinfra.sentry.io/wheels/covdefaults-2.3.0-py2.py3-none-any.whl (5.1 kB)
Collecting coverage==6.3.3
  Downloading https://pypi.devinfra.sentry.io/wheels/coverage-6.3.3-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 212.2/212.2 KB 66.0 MB/s eta 0:00:00
Collecting croniter==1.3.10
  Downloading https://pypi.devinfra.sentry.io/wheels/croniter-1.3.10-py2.py3-none-any.whl (18 kB)
Collecting cryptography==39.0.1
  Downloading https://pypi.devinfra.sentry.io/wheels/cryptography-39.0.1-cp36-abi3-manylinux_2_28_x86_64.whl (4.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.2/4.2 MB 70.4 MB/s eta 0:00:00
Collecting cssselect==1.0.3
  Downloading https://pypi.devinfra.sentry.io/wheels/cssselect-1.0.3-py2.py3-none-any.whl (16 kB)
Collecting cssutils==2.4.0
  Downloading https://pypi.devinfra.sentry.io/wheels/cssutils-2.4.0-py3-none-any.whl (404 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 405.0/405.0 KB 77.2 MB/s eta 0:00:00
Collecting datadog==0.29.3
  Downloading https://pypi.devinfra.sentry.io/wheels/datadog-0.29.3-py2.py3-none-any.whl (72 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.9/72.9 KB 24.6 MB/s eta 0:00:00
Collecting decorator==5.1.1
  Downloading https://pypi.devinfra.sentry.io/wheels/decorator-5.1.1-py3-none-any.whl (9.1 kB)
Collecting dictpath==0.1.3
  Downloading https://pypi.devinfra.sentry.io/wheels/dictpath-0.1.3-py3-none-any.whl (8.4 kB)
Collecting distlib==0.3.4
  Downloading https://pypi.devinfra.sentry.io/wheels/distlib-0.3.4-py2.py3-none-any.whl (461 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 461.2/461.2 KB 89.3 MB/s eta 0:00:00
Collecting django==3.2.23
  Downloading https://pypi.devinfra.sentry.io/wheels/Django-3.2.23-py3-none-any.whl (7.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━                   4.4/7.9 MB 212.7 MB/s eta 0:00:01
ERROR: Exception:
Traceback (most recent call last):
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 438, in _error_catcher
    yield
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 519, in read
    data = self._fp.read(amt) if not fp_closed else b""
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/cachecontrol/filewrapper.py", line 90, in read
    data = self.__fp.read(amt)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/http/client.py", line 459, in read
    n = self.readinto(b)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/http/client.py", line 503, in readinto
    n = self.fp.readinto(b)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 167, in exc_logging_wrapper
    status = run_func(*args)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/cli/req_command.py", line 205, in wrapper
    return func(self, options, args)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/commands/install.py", line 339, in run
    requirement_set = resolver.resolve(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/resolver.py", line 94, in resolve
    result = self._result = resolver.resolve(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
    state = resolution.resolve(requirements, max_rounds=max_rounds)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
    self._add_to_criteria(self.state.criteria, r, parent=None)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
    if not criterion.candidates:
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
    return bool(self._sequence)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
    return any(self)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
    return (c for c in iterator if id(c) not in self._incompatible_ids)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 47, in _iter_built
    candidate = func()
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/factory.py", line 215, in _make_candidate_from_link
    self._link_candidate_cache[link] = LinkCandidate(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 288, in __init__
    super().__init__(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 158, in __init__
    self.dist = self._prepare()
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 227, in _prepare
    dist = self._prepare_distribution()
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/resolution/resolvelib/candidates.py", line 299, in _prepare_distribution
    return preparer.prepare_linked_requirement(self._ireq, parallel_builds=True)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 487, in prepare_linked_requirement
    return self._prepare_linked_requirement(req, parallel_builds)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 532, in _prepare_linked_requirement
    local_file = unpack_url(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 214, in unpack_url
    file = get_http_url(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/operations/prepare.py", line 94, in get_http_url
    from_path, content_type = download(link, temp_dir.path)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/network/download.py", line 146, in __call__
    for chunk in chunks:
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/cli/progress_bars.py", line 304, in _rich_progress_bar
    for chunk in iterable:
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_internal/network/utils.py", line 63, in response_chunks
    for chunk in response.raw.stream(
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 576, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 541, in read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
  File "/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/runner/work/sentry/sentry/.venv/lib/python3.8/site-packages/pip/_vendor/urllib3/response.py", line 443, in _error_catcher
    raise ReadTimeoutError(self._pool, None, "Read timed out.")
pip._vendor.urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='pypi.devinfra.sentry.io', port=443): Read timed out.

asottile-sentry added the "S: needs triage" and "type: bug" labels Nov 3, 2023
@asottile-sentry

this is absolutely not the correct patch but it does satisfy my requirements for now

diff --git a/src/pip/_internal/network/download.py b/src/pip/_internal/network/download.py
index d1d43541e..4919f3e0a 100644
--- a/src/pip/_internal/network/download.py
+++ b/src/pip/_internal/network/download.py
@@ -7,6 +7,7 @@ import os
 from typing import Iterable, Optional, Tuple
 
 from pip._vendor.requests.models import CONTENT_CHUNK_SIZE, Response
+from pip._vendor.tenacity import retry, stop_after_attempt
 
 from pip._internal.cli.progress_bars import get_download_progress_renderer
 from pip._internal.exceptions import NetworkConnectionError
@@ -128,6 +129,7 @@ class Downloader:
         self._session = session
         self._progress_bar = progress_bar
 
+    @retry(reraise=True, stop=stop_after_attempt(5))
     def __call__(self, link: Link, location: str) -> Tuple[str, str]:
         """Download the file given by link into location."""
         try:
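
For readers without tenacity at hand, the decorator in the patch can be approximated with the standard library. This is a sketch of the behavior as I understand it (retry on any exception, no wait between attempts, re-raise the last error after five attempts), not a drop-in replacement; `retry_up_to` and `FakeDownloader` are hypothetical names.

```python
import functools
from typing import Any, Callable


def retry_up_to(max_attempts: int) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Retry the wrapped callable on any Exception, re-raising the last one."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # no attempts left: propagate the original error

        return wrapper

    return decorator


class FakeDownloader:
    """Stand-in for a downloader whose first two calls time out."""

    def __init__(self) -> None:
        self.calls = 0

    @retry_up_to(5)
    def __call__(self, link: str, location: str) -> str:
        self.calls += 1
        if self.calls < 3:
            raise TimeoutError("Read timed out.")
        # Pretend the file was saved under `location` with the link's basename.
        return f"{location}/{link.rsplit('/', 1)[-1]}"
```

Because the sketch retries every exception type, it would also repeat non-transient failures such as a missing file, which is one reason a broad retry like this is a stopgap.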

asottile-sentry added a commit to getsentry/sentry that referenced this issue Nov 20, 2023
pypa/pip#12383 (comment)

this is a terrible hack -- essentially:
- write a small `.pth` file to monkeypatch pip for any calls into the
virtualenv site-packages
- add a retrier to `Downloader.__call__`
- instrument all of our virtualenv creation with it
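
The `.pth` approach in the commit message relies on a documented behavior of Python's `site` module: any line in a `.pth` file that begins with `import` is executed when the directory is processed. A minimal sketch of that mechanism follows, with hypothetical file and module names (`retry_hook.pth`, `_retry_hook`) and a hook that only sets a flag instead of patching pip.

```python
import os
import site
import sys
import tempfile


def install_pth_hook(site_dir: str) -> None:
    """Write a hook module plus a .pth file that imports it."""
    # A real hook would wrap the download callable here; this one only
    # records that it was imported.
    with open(os.path.join(site_dir, "_retry_hook.py"), "w") as f:
        f.write("LOADED = True\n")
    # site.py executes any .pth line that starts with "import".
    with open(os.path.join(site_dir, "retry_hook.pth"), "w") as f:
        f.write("import _retry_hook\n")


def demo() -> bool:
    """Return True if processing the directory imported the hook module."""
    with tempfile.TemporaryDirectory() as site_dir:
        install_pth_hook(site_dir)
        # Processes .pth files the same way site-packages is handled at startup.
        site.addsitedir(site_dir)
        return getattr(sys.modules.get("_retry_hook"), "LOADED", False)
```

In a real virtualenv the two files would live in its site-packages directory, so the hook runs on every interpreter start, including when pip is invoked.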
k-fish pushed a commit to getsentry/sentry that referenced this issue Nov 21, 2023
armenzg pushed a commit to getsentry/sentry that referenced this issue Nov 27, 2023