-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace RequestsFetcher for Urllib3Fetcher #2762
base: develop
Are you sure you want to change the base?
replace RequestsFetcher for Urllib3Fetcher #2762
Conversation
… ngclient, replace requestsFecther with urllibFetcher in requestsFetcher unit tests. Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Hi @jku, I implemented a FetcherInterface that utilizes urllib3, most of the implementation is quite similar to RequestsFetcher. For the tests, I swapped out the RequestsFetcher for the Urllib3Fetcher in |
On first pass this looks really good, thanks -- I was worried there might be some differences that makes this complicated but your work seems very promising. Still, we'll have to be careful not to change any details we don't intend to as this is very core client code that easily affects applications (I mention this in case the review takes a while: it's only because I know this is a potentially tricky change). I think we can probably even drop the multiple poolmanagers: this manual connection pool management was added at a time when this wasn't part of the http library but now I believe the poolmanager handles exactly the issues the _get_poolmanager() code is working around. I'm going to review this better in the next days so I'll look into this. |
@@ -102,7 +102,7 @@ def __init__( | |||
if fetcher is not None: | |||
self._fetcher = fetcher | |||
else: | |||
self._fetcher = requests_fetcher.RequestsFetcher( | |||
self._fetcher = urllib3_fetcher.Urllib3Fetcher( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we change the default fetcher already in this PR then we'd want to change the requirements as well (pyprojects.toml, requirements/main.txt and regenerate requirements/pinned.txt) but we can still discuss if there's benefit to shipping this as alternative (non-default) fetcher first. Maybe there isn't -- if we don't think anyone is going to use and test it as an alternative then our only real choice is to swap the defaults
However, I think this change in dependencies would remove certifi from the indirect dependencies -- we'll have to take look at that and decide what is correct for python-tuf:
- what should the default certificate source be (certifi is the status quo but we may need a few lines of code to get the connectionpool to use it)
- do we allow users to change this default? how?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may need a few lines of code to get the connectionpool to use it
this is correct: by default urllib3 uses the OS certificate store
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to swap out the defaults in this PR, if it isn't too much overhead for changing the requirements. For the certificate source, I think it makes sense to utilize certifi
instead of the OS certificate source, based on this page it seems like a quick swap.
sounds good, I was looking at this page and I think it is supported, so I can remove _get_poolmanager() / _poolManagers and just use one of them. I was wondering if I should pass in a different value for the num_pools parameter, or if the default amount of 10 works. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some feedback now:
- let's leave the certificate store decision open for now: William has a good point that certifi is problematic in the linked sigstore-python issue...
- I believe we can drop the manual session management and just use a single PoolManager
- The timeouts need a bit of work: we are supposed to handle timeouts specifically so we can raise SlowRetrievalError... but it looks to me like that will actually mean little more work with urllib3 (since by default we get automatic retries which I believe means any timeouts are actually wapped in MaxRetryError)
- I think we could add some error parsing in _fetch and _chunks to only raise SlowRetrievalError on these cases
- but on the other hand I don't want complex code for a feature I don't really think is useful (defence against slow retrieval attacks is not really possible at library level IMO) -- so let's not overdo this
For the record I was looking at the timeout results with these simple tests:
(note that this simple test reuses the default connection pool so a successful connection will prevent retesting the connection timeout without reloading the module) |
Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Signed-off-by: NicholasTanz <[email protected]>
Description of the changes being introduced by the pull request:
Fixes #2751