gh-102471, PEP 757: Add PyLong import and export API #121339
base: main
Conversation
See also issue #111415
Just my 2c.
The gmpy2 code, used for benchmarking, can be found in my fork:
https://github.com/skirpichev/gmpy/tree/trying-py-import-export
Objects/longobject.c
Outdated
PyUnstable_Long_Export(PyObject *obj, PyUnstable_LongExport *long_export)
{
    if (!PyLong_Check(obj)) {
        PyErr_Format(PyExc_TypeError, "expect int, got %T", obj);
        return -1;
    }
    PyLongObject *self = (PyLongObject*)obj;

    long_export->obj = (PyLongObject*)Py_NewRef(obj);
    long_export->negative = _PyLong_IsNegative(self);
    long_export->ndigits = _PyLong_DigitCount(self);
    if (long_export->ndigits == 0) {
        long_export->ndigits = 1;
    }
    long_export->digits = self->long_value.ob_digit;
    return 0;
}
As this mostly gives direct access to the PyLongObject, it is almost as fast as using the private internals before.
Old code:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**2' 'mpz(x)'
1000000 loops, best of 20: 232 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=10**100' 'mpz(x)'
500000 loops, best of 11: 500 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**1000' 'mpz(x)'
100000 loops, best of 20: 2.53 usec per loop
With proposed API:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**2' 'mpz(x)'
1000000 loops, best of 20: 258 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**100' 'mpz(x)'
500000 loops, best of 20: 528 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=10**1000' 'mpz(x)'
100000 loops, best of 20: 2.56 usec per loop
Objects/longobject.c
Outdated
PyObject*
PyUnstable_Long_Import(int negative, size_t ndigits, Py_digit *digits)
{
    return (PyObject*)_PyLong_FromDigits(negative, ndigits, digits);
}
But this is something I would like to avoid. It requires allocating a temporary buffer and using memcpy. Can we offer a writable layout, to use its digits in mpz_export directly?
Benchmarks for old code:
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 11: 111 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 11: 475 nsec per loop
$ python -m timeit -r11 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 11: 2.39 usec per loop
With new API:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 20: 111 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 20: 578 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 20: 2.53 usec per loop
This requires allocation of a temporary buffer and using memcpy.
Right, PyLongObject has to manage its own memory.
Can we offer a writable layout to use it's digits in the mpz_export directly?
That sounds strange from the Python point of view and makes the internals "less opaque". I would prefer to leak less implementation details.
Right, PyLongObject has to manage its own memory.
I'm not trying to change that. More complete proposal: vstinner#4
gmpy2 patch: https://github.com/skirpichev/gmpy/tree/trying-py-import-export-v2
New benchmarks:
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**2)' 'int(x)'
2000000 loops, best of 20: 111 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**100)' 'int(x)'
500000 loops, best of 20: 509 nsec per loop
$ python -m timeit -r20 -s 'from gmpy2 import mpz;x=mpz(10**1000)' 'int(x)'
100000 loops, best of 20: 2.44 usec per loop
I would prefer to leak less implementation details.
I don't think this leaks anything; it doesn't leak memory management details. PyLong_Import
will just allocate the memory. Writing the digits will be the job of mpz_export, as before.
Without this, there seems to be a noticeable performance regression for integers of intermediate range: something up to 20% vs 7% on my branch.
Edit: currently, the proposed PyUnstable_Long_ReleaseImport()
matches PyUnstable_Long_ReleaseExport().
Perhaps it could be one function (say, PyUnstable_Long_ReleaseDigitArray()), but I'm unsure: maybe that puts some constraints on the internals of PyLongObject.
CC @tornaria, as Sage people might be interested in this feature.
CC @oscarbenjamin, you may want this for python-flint.
I updated my PR.
Misc/NEWS.d/next/C API/2024-07-03-17-26-53.gh-issue-102471.XpmKYk.rst
Outdated
Absolutely. Currently python-flint uses a hex-string as an intermediate format when converting between large int and …
@skirpichev: I added a PyLongWriter API similar to what @encukou proposed. Example:

PyLongObject *
_PyLong_FromDigits(int negative, Py_ssize_t digit_count, digit *digits)
{
    PyLongWriter *writer = PyLongWriter_Create();
    if (writer == NULL) {
        return NULL;
    }
    if (negative) {
        PyLongWriter_SetSign(writer, -1);
    }
    Py_digit *writer_digits = PyLongWriter_AllocDigits(writer, digit_count);
    if (writer_digits == NULL) {
        goto error;
    }
    memcpy(writer_digits, digits, digit_count * sizeof(digit));
    return (PyLongObject*)PyLongWriter_Finish(writer);

error:
    PyLongWriter_Discard(writer);
    return NULL;
}

>>> import _testcapi; _testcapi.pylong_import(0, [100, 0, 0]) is 100
True
I mark the PR as a draft until we agree on the API.
Yes, that looks better and should fix the speed regression. I'll try to benchmark that, perhaps tomorrow. But the cost is 5 (!) public functions and one new struct, in addition to …
I updated the PR to remove the …
My concern is to avoid the problem of capi-workgroup/api-evolution#36: avoid exposing …
@skirpichev: Would it be useful to add a …
On Tue, Sep 17, 2024 at 04:44:15AM -0700, Victor Stinner wrote:
Do you have concerns about Windows where long is only 32-bit?
Yes, we support Windows (…).
For new APIs, I would prefer to use a type with a known size: int32_t,
int64_t, etc.
No, I'm not suggesting to change the type of the value field. But a long
value should also fit in int64_t. I'm just not sure that an inlined
PyLong_AsInt64 is the fastest solution.
There is always a way to make this work though, like:

#if sizeof(int64_t) == sizeof(long)
Here are new benchmarks for PyLong_Export with the updated gmpy2 implementation (it follows the PEP). Current implementation:
With PyUnstable_Long_CompactValue:

int
PyLong_Export(PyObject *obj, PyLongExport *export_long)
{
    if (!PyLong_Check(obj)) {
        PyErr_Format(PyExc_TypeError, "expect int, got %T", obj);
        return -1;
    }
    PyLongObject *self = (PyLongObject*)obj;

    if (PyUnstable_Long_IsCompact(self)) {
        export_long->value = (int64_t)PyUnstable_Long_CompactValue(self);
        export_long->negative = 0;
        export_long->ndigits = 0;
        export_long->digits = 0;
        export_long->_reserved = 0;
    }
    else {
        export_long->value = 0;
        export_long->negative = _PyLong_IsNegative(self);
        export_long->ndigits = _PyLong_DigitCount(self);
        if (export_long->ndigits == 0) {
            export_long->ndigits = 1;
        }
        export_long->digits = self->long_value.ob_digit;
        export_long->_reserved = (Py_uintptr_t)Py_NewRef(obj);
    }
    return 0;
}
With PyLong_AsLongAndOverflow:

int
PyLong_Export(PyObject *obj, PyLongExport *export_long)
{
    if (!PyLong_Check(obj)) {
        PyErr_Format(PyExc_TypeError, "expect int, got %T", obj);
        return -1;
    }
    PyLongObject *self = (PyLongObject*)obj;

    int overflow;
    long value = PyLong_AsLongAndOverflow(obj, &overflow);
    if (!overflow) {
        export_long->value = (int64_t)value;
        export_long->negative = 0;
        export_long->ndigits = 0;
        export_long->digits = 0;
        export_long->_reserved = 0;
    }
    else {
        export_long->value = 0;
        export_long->negative = _PyLong_IsNegative(self);
        export_long->ndigits = _PyLong_DigitCount(self);
        if (export_long->ndigits == 0) {
            export_long->ndigits = 1;
        }
        export_long->digits = self->long_value.ob_digit;
        export_long->_reserved = (Py_uintptr_t)Py_NewRef(obj);
    }
    return 0;
}
Probably the last one is the fastest. The current implementation of PyLong_Export() is also the hardest to port.
I think this should fix warnings.
This reverts commit 5d3e224.
Co-authored-by: Sergey B Kirpichev <[email protected]>
Add PyLong_Export() and PyLong_Import() functions and PyLong_LAYOUT structure.
📚 Documentation preview 📚: https://cpython-previews--121339.org.readthedocs.build/