gh-124153: Introduce PyType_GetBaseByToken function (PoC) #121079
from timeit import timeit

setup = f"""if 1:
    import _testcapi
    A = _testcapi.create_type_with_token("_testcapi.A", 0)
    tokenA = _testcapi.get_tp_token(A)
    class B(A): pass
    class C(B): pass
    class D(C): pass
    class E(D): pass
    getbase = _testcapi.repeat_getbasebytoken
"""
c_repeat = 10  # py_repeat = timeit default (1000000)
mro = timeit(s1 := f'getbase(C, tokenA, {c_repeat}, True)', setup)
bases = timeit(s2 := f'getbase(C, tokenA, {c_repeat}, False)', setup)
print(s1, mro)
print(s2, bases, bases / mro)

Win non-debug: (the higher, the slower)
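The variable names in the benchmark (`mro`, `bases`) suggest that the last argument of `repeat_getbasebytoken` switches between scanning the full MRO and following only the chain of primary bases. That reading is an assumption, not something stated in the PR. Below is a minimal pure-Python sketch of how the two strategies can differ. The `_token` class attribute and both helper names are hypothetical stand-ins for the C-level token and are not part of any real API.

```python
# Toy model only: tokens are plain objects kept in a hypothetical `_token`
# class attribute; the real token lives in a C-level type slot.

def find_by_mro(cls, token):
    """Scan the full method resolution order for the class owning `token`."""
    for base in cls.__mro__:
        # vars() inspects only the class's own namespace, so a subclass
        # does not appear to own a token it merely inherits.
        if vars(base).get("_token") is token:
            return base
    return None


def find_by_base_chain(cls, token):
    """Follow only the primary base (`__base__`) at each step."""
    while cls is not None:
        if vars(cls).get("_token") is token:
            return cls
        cls = cls.__base__  # object.__base__ is None, which ends the loop
    return None


TOKEN_A = object()


class A:
    _token = TOKEN_A


class B(A): pass
class C(B): pass


class Mixin: pass
class D(Mixin, C): pass  # multiple inheritance: D.__base__ is Mixin


print(find_by_mro(C, TOKEN_A) is A)         # both strategies agree for C
print(find_by_base_chain(C, TOKEN_A) is A)
print(find_by_mro(D, TOKEN_A) is A)         # the MRO scan still reaches A
print(find_by_base_chain(D, TOKEN_A))       # the base chain misses A here
```

In this model the base-chain walk is only equivalent to the MRO scan for single-inheritance hierarchies such as the `A` to `E` chain used in the benchmark.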
This keeps the slowdown within 2% on the `telco` benchmark (PGO). Unlike `PyDecContextObject`, extending the `PyDecObject` struct seems to affect only binary ops and appears to waste memory.
Up to 2% faster than upstream on the `telco` benchmark (PGO and non-PGO). Based on the GetBaseByToken() optimization in ac82d36.
Keeps performance unchanged even if the private function is not inlined (i.e. PGO training does not cover it well).
PyType_GetBaseByToken() fails to inline the wrapped private function, whose overhead appears to be non-negligible.
This cleanup can cause a 10% slowdown on the `telco` benchmark for some reason.
This version sets *result to NULL at the end to reduce the overhead of a double memory access when returning true. Under verification.
The code looks good. Do you want to clear the draft bit, and file an issue?
If there are more performance tweaks to make, they can go in a follow-up PR.
Modules/_ctypes/ctypes.h
        PyErr_Format(PyExc_TypeError, "expected a ctypes type, got '%N'", type);
        return NULL;
    }
    exercise_get_base_by_token(PyCType_Type);
Could you remove the exercise from this PR?
Hopefully the training will get better as PyType_GetBaseByToken
is used more; if not, we can adjust it in a future PR.
I'll post a new PR. It may be better to have a dedicated branch for when the result
argument is NULL.
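To illustrate what a dedicated branch for a NULL result could look like, here is a hedged pure-Python model, not the PR's C code. It assumes the convention that the lookup returns 1 when a matching base is found and 0 otherwise, and that a NULL `result` means the caller wants only that status. The error path is left out. The one-element list `out`, the `_token` attribute, and the function name are hypothetical devices for the sketch.

```python
# Toy model only: `out` imitates a C out-parameter, `_token` imitates the
# C-level token slot. Neither exists in the real API.

def get_base_by_token(cls, token, out=None):
    """Return 1 if a class in cls.__mro__ owns `token`, else 0.

    out=None models a NULL `result` argument: only the status is produced.
    Otherwise out[0] receives the matching class, or None when not found.
    """
    if out is None:
        # Dedicated status-only branch: nothing is stored for the caller.
        for base in cls.__mro__:
            if vars(base).get("_token") is token:
                return 1
        return 0

    out[0] = None  # cleared up front so a miss never leaves stale data
    for base in cls.__mro__:
        if vars(base).get("_token") is token:
            out[0] = base
            return 1
    return 0


TOKEN = object()


class Base:
    _token = TOKEN


class Derived(Base): pass


slot = ["stale"]
print(get_base_by_token(Derived, TOKEN))        # status only
print(get_base_by_token(Derived, TOKEN, slot))  # status plus the match
print(slot[0] is Base)
```

Splitting the two paths keeps the status-only call free of any store to the out-parameter. Whether that pays off in the C implementation is a question for the benchmarks, not something this model can show.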
Reference implementation of the following C-API functions:
PyType_GetBaseByToken()
PyType_GetToken()
Discussion: https://discuss.python.org/t/55598
📚 Documentation preview 📚: https://cpython-previews--121079.org.readthedocs.build/
PyType_GetBaseByToken function with Py_tp_token slot #124153