urljoin() undocumented behavior change in Python 3.14. #125926

Open
felixxm opened this issue Oct 24, 2024 · 6 comments
Labels: 3.14 (new features, bugs and security fixes), stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)

Comments

@felixxm
Contributor

felixxm commented Oct 24, 2024

Bug report

Bug description:

Django is tested against the earliest alpha versions of Python. We noticed a behavior change in urllib.parse.urljoin(), which is used in a few places in Django, e.g. for staticfiles and the built-in storages.

Python 3.14.0a1:

>>> from urllib.parse import urljoin
>>> urljoin("/static/", "admin/img/icon-addlink.svg")
'admin/img/icon-addlink.svg'

Python 3.13 and earlier:

>>> from urllib.parse import urljoin
>>> urljoin("/static/", "admin/img/icon-addlink.svg")
'/static/admin/img/icon-addlink.svg'

Is this an intentional change?
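For anyone who wants to check which of the two behaviors a given interpreter shows, a minimal probe might look like the sketch below. The helper name `probe_relative_base_join` is hypothetical and not part of Django or the standard library; it only wraps the call from the report and states whether the relative base path survived the join.

```python
import sys
from urllib.parse import urljoin


def probe_relative_base_join(base="/static/", ref="admin/img/icon-addlink.svg"):
    """Join ref onto a relative base and report whether the base was kept.

    Hypothetical helper for illustration only. Returns a tuple
    (joined_url, base_was_kept).
    """
    joined = urljoin(base, ref)
    # For a base ending in "/" and a plain relative path, the pre-3.14
    # result is simply the concatenation of the two strings.
    return joined, joined == base + ref


if __name__ == "__main__":
    joined, base_was_kept = probe_relative_base_join()
    status = "base kept" if base_was_kept else "base dropped"
    print(sys.version.split()[0], repr(joined), status)
```

Running this under each interpreter gives a one-line summary that can be pasted into a report.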

CPython versions tested on:

3.14

Operating systems tested on:

Linux

Linked PRs

@felixxm felixxm added the type-bug An unexpected behavior, bug, or error label Oct 24, 2024
@nineteendo
Contributor

I'm assuming this is a bug fixed by #123273.

@felixxm
Contributor Author

felixxm commented Oct 24, 2024

> I'm assuming this is a bug fixed by #123273.

#123273 is included in 3.14.0a1, so (unfortunately) it cannot address this issue.

@nineteendo
Contributor

Sorry, I meant this might be caused by it.

@ZeroIntensity ZeroIntensity added stdlib Python modules in the Lib dir 3.14 new features, bugs and security fixes labels Oct 24, 2024
@felixxm
Contributor Author

felixxm commented Oct 25, 2024

> Sorry, I meant this might be caused by it.

I can bisect and confirm (or not) during the weekend.

@hugovk
Member

hugovk commented Oct 25, 2024

Yes, git bisect points to fc897fc from #123273.

fc897fcc01964649f023e0baa4c95d142e4e8a10 is the first bad commit
commit fc897fcc01964649f023e0baa4c95d142e4e8a10
Date:   Sat Aug 31 12:42:08 2024 +0300

    gh-76960: Fix urljoin() and urldefrag() for URIs with empty components (GH-123273)

    * urljoin() with relative reference "?" sets empty query and removes fragment.
    * Preserve empty components (authority, params, query, fragment) in urljoin().
    * Preserve empty components (authority, params, query) in urldefrag().

    Also refactor the code and get rid of double _coerce_args() and
    _coerce_result() calls in urljoin(), urldefrag(), urlparse() and
    urlunparse().

 Lib/test/test_urlparse.py                                              |  87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------
 Lib/urllib/parse.py                                                    | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------------------------
 Misc/NEWS.d/next/Library/2024-08-23-22-01-30.gh-issue-76960.vsANPu.rst |   5 +++++
 3 files changed, 140 insertions(+), 52 deletions(-)
 create mode 100644 Misc/NEWS.d/next/Library/2024-08-23-22-01-30.gh-issue-76960.vsANPu.rst

@serhiy-storchaka serhiy-storchaka self-assigned this Oct 25, 2024
serhiy-storchaka added a commit to serhiy-storchaka/cpython that referenced this issue Oct 25, 2024
…ed authority

Although this goes beyond the application of RFC 3986, urljoin()
should support relative base URIs for backward compatibility.
@serhiy-storchaka
Member

serhiy-storchaka commented Oct 25, 2024

urljoin() was updated to match the reference resolution algorithm in RFC 3986, Section 5. That algorithm requires the base URI to be an absolute URI, which is why there were no tests for a relative base URI. However, urljoin() is used in practice with relative base URIs, and it should continue to return a sensible result for them, even if this goes beyond the application of RFC 3986.
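Since the RFC 3986 algorithm is only defined for an absolute base, one possible stopgap for code that has to run on an affected alpha is to make the base absolute for the duration of the join and strip the prefix afterwards. This is a rough sketch under stated assumptions, not the fix applied in CPython: the helper name `join_with_relative_base` and the placeholder origin are hypothetical, and only path-absolute bases such as "/static/" are handled specially.

```python
from urllib.parse import urljoin, urlsplit

# Hypothetical placeholder origin, used only to make the base absolute.
_PLACEHOLDER_ORIGIN = "http://placeholder.invalid"


def join_with_relative_base(base, ref):
    """Join ref onto base, tolerating a base that is only an absolute path.

    Illustrative workaround, not part of urllib. Anything other than a
    path-absolute base with a scheme-less, authority-less reference is
    passed straight to urljoin().
    """
    base_parts = urlsplit(base)
    ref_parts = urlsplit(ref)
    if (
        base_parts.scheme
        or base_parts.netloc
        or not base.startswith("/")
        or ref_parts.scheme
        or ref_parts.netloc
    ):
        return urljoin(base, ref)

    # Resolve against an absolute version of the base, which is the case
    # RFC 3986, Section 5 actually covers.
    joined = urljoin(_PLACEHOLDER_ORIGIN + base, ref)
    if joined.startswith(_PLACEHOLDER_ORIGIN):
        # Drop the placeholder so the caller gets a path-only result back.
        return joined[len(_PLACEHOLDER_ORIGIN):]
    return joined
```

With this helper, `join_with_relative_base("/static/", "admin/img/icon-addlink.svg")` goes through the absolute-base code path, so it should not depend on how a given interpreter treats relative bases.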
