[WIP] GH-65238: Preserve trailing slash in pathlib #112363

Closed
wants to merge 31 commits into from

@barneygale barneygale commented Nov 24, 2023

Fix the last known case where pathlib can mangle the meaning of a path. This brings pathlib in line with IEEE Std 1003.1-2017, where trailing slashes are meaningful to path resolution and should not be discarded.

Changes

With this change, paths with trailing slashes behave differently from their slash-less counterparts in several important respects:

  • Paths with and without trailing slashes compare unequal and generate different hash codes
  • __str__(), __fspath__() and related representations include any trailing slash
  • glob() patterns ending with a slash will now generate results ending with a slash, matching glob.glob() behaviour
  • match() now observes trailing slashes, and so its pattern language exactly matches that of glob().
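For contrast, here is a small sketch of the baseline this list departs from. It only exercises released `pathlib` and `glob` as I understand them, not this branch: the trailing slash is dropped when the path is parsed, while `glob.glob()` keeps it in its results. The temporary directory and the `sub` name are made up for illustration.

```python
import glob
import os
import tempfile
from pathlib import Path, PurePosixPath

# Released pathlib (not this PR): the trailing slash is lost during parsing,
# so both spellings produce the same object.
with_slash = PurePosixPath("foo/")
without_slash = PurePosixPath("foo")
print(str(with_slash))                          # foo
print(with_slash == without_slash)              # True
print(hash(with_slash) == hash(without_slash))  # True

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))  # hypothetical directory name

    # A pattern ending in a separator: glob.glob() keeps the separator
    # on every result it returns.
    from_glob = glob.glob(os.path.join(tmp, "*", ""))

    # The same pattern through released pathlib yields Path objects,
    # whose string form cannot carry a trailing separator.
    from_pathlib = [str(p) for p in Path(tmp).glob("*/")]

print(from_glob[0].endswith(os.sep))     # True
print(from_pathlib[0].endswith(os.sep))  # False
```

Under the changes listed above, the two `PurePosixPath` objects would compare unequal and the pathlib glob results would end with a separator, as the `glob.glob()` results do.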

To manipulate a trailing slash, we add these new methods/properties:

  • has_trailing_sep - read-only boolean indicating whether a trailing slash is present
  • with_trailing_sep() - returns a new path with a trailing slash present
  • without_trailing_sep() - returns a new path with a trailing slash omitted
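To make the intended semantics concrete, here is a rough string-level sketch of the three operations. These are hypothetical free functions on POSIX-style strings, not the methods added by this PR, and the handling of the root and the empty path is an assumption on my part.

```python
import posixpath

SEP = posixpath.sep  # "/"; this sketch deliberately ignores Windows paths


def has_trailing_sep(path: str) -> bool:
    """True if the path ends with a separator that is not part of the root."""
    # Assumption: a path made only of separators (e.g. "/") is a root,
    # not a path with a trailing separator.
    return path.endswith(SEP) and path.strip(SEP) != ""


def with_trailing_sep(path: str) -> str:
    """Return the path with a trailing separator present."""
    # Assumption: the empty path is left alone, since appending a separator
    # would turn it into the root.
    if not path or path.endswith(SEP):
        return path
    return path + SEP


def without_trailing_sep(path: str) -> str:
    """Return the path with any trailing separator omitted."""
    stripped = path.rstrip(SEP)
    # Keep roots such as "/" intact rather than reducing them to "".
    return stripped if stripped else path


print(has_trailing_sep("foo/"))      # True
print(with_trailing_sep("foo"))      # foo/
print(without_trailing_sep("foo/"))  # foo
print(without_trailing_sep("/"))     # /
```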

Backwards compatibility

Empty segments given to the PurePath initialiser do not generate new segments, so str(PurePath("foo", "")) results in "foo", not "foo/".
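As a quick check of what is being preserved here, the snippet below shows how released `pathlib` treats an empty final segment, next to `posixpath.join()`, which does produce a trailing slash. It runs against released Python, not this branch.

```python
import posixpath
from pathlib import PurePosixPath

# os.path-style joining: an empty final segment yields a trailing slash.
joined_by_os_path = posixpath.join("foo", "")
print(joined_by_os_path)   # foo/

# Released pathlib: the empty segment contributes nothing. The PR keeps
# this result for backwards compatibility.
joined_by_pathlib = str(PurePosixPath("foo", ""))
print(joined_by_pathlib)   # foo
```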

Methods concerned with dirnames and basenames ignore any trailing slash:

  • name, stem, suffix and suffixes retrieve the last non-empty path segment, so any trailing slash is ignored
  • with_name(), with_stem() and with_suffix() replace the last non-empty path segment
  • parent and parents ignore any trailing slash
  • relative_to() and is_relative_to() ignore any trailing slashes in self and other.
  • The .parts tuple works exactly as before, and doesn't distinguish paths with trailing separators
    • Not sure if this is right tbh
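The "last non-empty path segment" rule is easiest to see next to `posixpath`, which treats the text after the final slash as the basename. The sketch below uses released `pathlib`, where the slash is already dropped at parse time, so these values are the ones the list above says should stay unchanged. The example path is made up.

```python
import posixpath
from pathlib import PurePosixPath

raw = "foo/bar.tar.gz/"  # hypothetical path with a trailing slash
p = PurePosixPath(raw)

# posixpath splits at the final slash, so the basename is empty.
print(repr(posixpath.basename(raw)))  # ''
print(posixpath.dirname(raw))         # foo/bar.tar.gz

# pathlib works on the last non-empty segment instead.
print(p.name)      # bar.tar.gz
print(p.stem)      # bar.tar
print(p.suffix)    # .gz
print(p.suffixes)  # ['.tar', '.gz']
print(p.parent)    # foo
print(p.parts)     # ('foo', 'bar.tar.gz')
```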

Dependencies

Future work

Once this lands, we can take advantage of the fact that pathlib does not mangle paths. Thus:


@barneygale barneygale closed this Dec 19, 2023