Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-124153: Introduce PyType_GetBaseByToken function (PoC) #121079

Closed
wants to merge 103 commits into from

Conversation

neonene
Copy link
Contributor

@neonene neonene commented Jun 27, 2024

Reference implementation of the following C-API functinons:

  • PyType_GetBaseByToken()
  • PyType_GetToken()

Discussion: https://discuss.python.org/t/55598


📚 Documentation preview 📚: https://cpython-previews--121079.org.readthedocs.build/

@neonene

This comment was marked as outdated.

@neonene
Copy link
Contributor Author

neonene commented Jul 3, 2024

The tp_bases can be used in a fallback implementation. I checked the overhead, adding a repeat function temporarily (100972d):

from timeit import timeit
setup = f"""if 1:
    import _testcapi
    A = _testcapi.create_type_with_token("_testcapi.A", 0)
    tokenA = _testcapi.get_tp_token(A)
    class B(A): pass
    class C(B): pass
    class D(C): pass
    class E(D): pass
    getbase = _testcapi.repeat_getbasebytoken
"""
c_repeat = 10  # py_repeat = timeit default (1000000)
mro   = timeit(s1 := f'getbase(C, tokenA, {c_repeat}, True)', setup)
bases = timeit(s2 := f'getbase(C, tokenA, {c_repeat}, False)', setup)
print(s1, mro)
print(s2, bases,  bases / mro)

Win non-debug: (the higher, the slower)

find A
from
starts
with
run once
in C
repeat 10
in C
mro class A 1.00 1.00
bases class A 1.00x 1.00x
mro class B 1.00x 1.05x
bases class B 1.01x 1.14x
mro class C 1.01x 1.10x
bases class C 1.04x 1.40x
mro class E 1.02x 1.16x
bases class E 1.08x 1.50x

@neonene neonene changed the title Introduce PyType_GetBaseByToken function and friends Introduce PyType_GetBaseByToken function Jul 3, 2024
This will keep the slowdown by up to 2% on the `telco` benchmark (PGO).  Unlike the `PyDecContextObject`, extending the `PyDecObject` struct seems to affect only binary ops and seems to be a waste of memory.
Faster than the upstream by up to 2% on the `telco` benchmarks (PGO/non-PGO).  Based on the GetBaseByToken() optimization
by ac82d36.
Keeps the performance unchanged even if the private function is not inlined (i.e. not trained well on PGO).
PyType_GetBaseByToken() fails to inline the wrapped ptivate function, whose overhead appears to be not ignorable.
This cleanup can cause a slowdown by 10% on the `telco` benchmark for some reason.
This version sets the *result to NULL at the end to reduce the overhead of double memory acces when returning true. Under verification.
Copy link
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good. Do you want to clear the draft bit, and file an issue?

If there are performance more tweaks to make, they can go in a follow-up PR.

PyErr_Format(PyExc_TypeError, "expected a ctypes type, got '%N'", type);
return NULL;
}
exercise_get_base_by_token(PyCType_Type);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove the exercise from this PR?
Hopefully the training will get better as PyType_GetBaseByToken is used more; if not, we can adjust it in a future PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll post a new PR. It may be better to have a dedicated branch when the result argument is NULL.

@neonene neonene changed the title Introduce PyType_GetBaseByToken function gh-124153: Introduce PyType_GetBaseByToken function (PoC) Sep 17, 2024
@neonene neonene closed this Sep 20, 2024
@neonene neonene deleted the bytoken branch September 20, 2024 03:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants