Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-124694: Add concurrent.futures.InterpreterPoolExecutor #124548

Merged

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrently ericsnowcurrently commented Sep 25, 2024

This is an implementation of InterpreterPoolExecutor that builds on ThreadPoolExecutor.

This assumes that we're okay adding the executor separately from PEP 734. That PEP is about adding a new stdlib module, which is a separate matter from adding the new executor.

Possible future improvements:

  • support passing (most) arbitrary functions without pickling
  • support passing closures
  • optionally exec functions against __main__ instead of the their original module

CC @brianquinlan

@ericsnowcurrently ericsnowcurrently changed the title Add concurrent.futures.InterpreterPoolExecutor gh-124694: Add concurrent.futures.InterpreterPoolExecutor Sep 27, 2024
@ericsnowcurrently ericsnowcurrently marked this pull request as ready for review September 27, 2024 22:37
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Doc/library/concurrent.futures.rst Outdated Show resolved Hide resolved
Lib/concurrent/futures/interpreter.py Outdated Show resolved Hide resolved
@ericsnowcurrently
Copy link
Member Author

@ZeroIntensity, I've fixed all those Docs things.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. A small nitpick is that it might be a good idea to mention textwrap.dedent in the docs for initializer -- I'm worried that users might run into pesky indentation problems when passing scripts, and textwrap.dedent is a nice way to deal with that.

@ericsnowcurrently
Copy link
Member Author

Good point. It would make sense to automatically call it for users.

@ericsnowcurrently
Copy link
Member Author

@brianquinlan, what are your thoughts on this?

@brianquinlan
Copy link
Contributor

@brianquinlan, what are your thoughts on this?

@ericsnowcurrently I haven't been active for yours but this seems like an excellent addition.

@ericsnowcurrently
Copy link
Member Author

I'm merging this once it passes, in the interest of unblocking other stuff I'm working on. I'm sure the docs aren't quite where we might want them, but we can iterate on them separately.

@ericsnowcurrently ericsnowcurrently merged commit a5a7f5e into python:main Oct 16, 2024
37 checks passed
@ericsnowcurrently ericsnowcurrently deleted the interpreter-pool-executor branch October 16, 2024 22:50
@ericsnowcurrently
Copy link
Member Author

ericsnowcurrently commented Oct 17, 2024

This may have introduced a refleak crash on one of the refleak buildbots: https://buildbot.python.org/#/builders/259/builds/1528. I'm taking a look.

@ZeroIntensity
Copy link
Member

Yeah, I noticed the tests for InterpreterPoolExecutor segfaulting on someone elses PR yesterday on something seemingly unrelated. Since this is pure-Python, _interpreters might be buggier than we think :(

@ericsnowcurrently
Copy link
Member Author

In the crash I linked to, it happened when calling _interpqueues.create(), so not _interpreters.

@ericsnowcurrently
Copy link
Member Author

FWIW, the only other failure I've seen with this on refleak buildbots is AMD64 FreeBSD Refleaks 3.x, with 3 successful runs after one failure, which looks like it hung.

The other one I mentioned was on AMD64 RHEL8 Refleaks 3.x, and has had multiple successful runs to go with that one crash.

I don't see any other failures for this on other stable buildbots.

@ericsnowcurrently
Copy link
Member Author

I also haven't been able to reproduce any crash or hang locally yet.

@ZeroIntensity
Copy link
Member

I guess it's possible that this only affects AMD?

@ericsnowcurrently
Copy link
Member Author

It's probably a race due to some load profile that has only shown up there.

@ZeroIntensity
Copy link
Member

I'll run the tests under valgrind to see if it picks anything up. That will take a while, though. Do you want to revert this in the meantime?

@ZeroIntensity
Copy link
Member

ZeroIntensity commented Oct 17, 2024

Ah, @ericsnowcurrently, I think I found the problem. In most of the _interpqueues methods, something like this is passed to PyArg* as a converter:

qidarg_converter_data qidarg;

As far as I can tell, default struct initialization is not C standard, so it's values are just junk stack memory. I'm guessing that x86 has some sort of detail that sets stack-allocated structs to NULL, but not on ARM--that's why it's only failing there.

Edit: Oh wait, it's AMD, not ARM. I guess it's a chip-specific issue then, but the point is that it's UB.

@ericsnowcurrently
Copy link
Member Author

Hmm, I'll take a look.

@ericsnowcurrently
Copy link
Member Author

Yeah, that could very well be it. The "label" field would sometimes be a problem if not initialized. I'll fix that.

@ericsnowcurrently
Copy link
Member Author

I've opened gh-125667.

@ZeroIntensity
Copy link
Member

Great, I'll review your PR whenever you get to it.

@ericsnowcurrently
Copy link
Member Author

gh-125668

@ericsnowcurrently
Copy link
Member Author

I also noticed a failure on the Android buildbot: https://buildbot.python.org/#/builders/1594/builds/338. I'm looking into it.

@ericsnowcurrently
Copy link
Member Author

#125708

@ericsnowcurrently
Copy link
Member Author

Looks like there's more to do: https://buildbot.python.org/#/builders/1610/builds/198 (AMD64 CentOS9 NoGIL Refleaks). I'm looking into it.

@ericsnowcurrently
Copy link
Member Author

I've opened a new issue to deal with this, so we don't keep going on here: #125716.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants