broken link to A.Neumaier article in built-in sum comment #111933
Comments
FYI, citation: Neumaier, A. (1974), Rundungsfehleranalyse einiger Verfahren zur Summation endlicher Summen. Z. angew. Math. Mech., 54: 39-51. https://doi.org/10.1002/zamm.19740540106. The full text is still available in the archive: https://web.archive.org/web/20220804051351/https://www.mat.univie.ac.at/~neum/scan/01.pdf Should we use this URL or the citation above (no free full-text access)? BTW, I doubt there is an issue besides the broken link. Floating-point addition is not associative, see e.g. TAOCP, Vol. 2, 4.2.2. The Tutorial has a dedicated section on floating-point arithmetic, which covers this "issue".
Also, the decimal module is not really about exact arithmetic; the fractions module is. |
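A minimal sketch of the two points in the comment above, using only the standard library. The specific values are illustrative choices and do not come from the issue: the first part shows that regrouping a float addition changes the result, the second contrasts the finite default precision of decimal with the exact rationals of fractions.

```python
from decimal import Decimal
from fractions import Fraction

# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Decimal works with a finite precision (28 significant digits by default),
# so it still rounds.
third_dec = Decimal(1) / Decimal(3)
print(third_dec * 3 == 1)  # False

# Fraction stores exact rationals, so the same computation is exact.
third_frac = Fraction(1, 3)
print(third_frac * 3 == 1)  # True
```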
cc @rhettinger |
One might naively expect sum(a) to be .9 -.6 = .3. But it is actually 0.29999999999999993, so the error is about 7e-17. The sum(a2) is about 3e-17, so the absolute error has gone down. 0.0 error is something of an accident. Now add -.3 to list @enzbus In the future, please first ask about possible bugs on help forums such as https://discuss.python.org/c/users/7. And please omit disparaging comments wherever you post. |
You also don't have to modify the example too much to see a case where the accuracy of the sum function improved with the change:
|
I'm actually getting the error on random inputs; see the commit above, which references this bug report. The point remains: this is a regression, it adds little value (as I explain above), it is the implementation of a 1974 paper in German (really?) which is not online any more, and it should not have happened. |
@terryjreedy please re-open, at least to see if this affects other users. Please note that there were no disparaging comments. I noted that the Standard Library already provides arbitrary-precision arithmetic in the decimal module, and that this change refers to a suspicious article in German which I couldn't even find (and I went to the researcher's academic webpage). I think this should not have passed quality review, since this bug is real (I provided minimal code to show it, but I encountered it on random inputs where the comparison is exact on Python < 3.12), especially since it adds little value. |
@terryjreedy, let's not forget the real issue with the broken link, see PR #111937. |
Everyone who uses non-integral binary floats in any language, not just Python, is affected by the quirks of decimal-binary conversions and of floating-point arithmetic. Comparing the results with @enzbus Suggesting that most or all of us core developers do not care about code quality is not a compliment. @skirpichev You are right. Reopening for the doc fix. |
Since the minimal code I provided above is not good enough, here's a more interesting version:
import random
random.seed(0)
a = [random.gauss(mu=0.0, sigma=1.0) for i in range(10)]
sum_a = sum(a)
sum(a + [-sum_a]) == 0
This is False on 3.12, and True on all earlier versions. That's a bug in my opinion, and it's not justified by improved accuracy in some extreme cases. Also, a 1974 paper in German from a researcher who doesn't even upload it to his academic webpage is not good enough for a fundamental code change, and I don't think it should have passed quality control. |
Did you not read the researcher's page? It says the site moved and provides the new address: https://arnold-neumaier.at/ And the paper is there as well: https://arnold-neumaier.at/scan/01.pdf If I prepend the sum, your last example fails on 3.11 (I can't test 3.12):
import random
random.seed(0)
a = [random.gauss(mu=0.0, sigma=1.0) for i in range(10)]
sum_a = sum(a)
print(sum([-sum_a] + a) == 0) # prints False |
Yes, and it also says that he moved his website because the University doesn't approve of its content, which doesn't suggest it's particularly trustworthy.
Your argument is invalid; I can give much simpler examples of inexact floating-point arithmetic. Instead, I gave you multiple examples of arithmetic that was correct on all CPython versions < 3.12 and is broken on 3.12. (You can try different random seeds: 3.12 is broken ~95% of the time and earlier Pythons are always correct.) The burden is on whoever proposed and accepted this change to show that a 1e-17 improvement on some examples is worth breaking existing code. And a 1974 paper in German on some guy's website is not a good exhibit. By the way, the examples I brought work on any implementation of sum that doesn't fiddle with dynamic programming (that is, one that is not state-dependent). This is a fundamental change to a builtin function used throughout the library, and I'm under the impression that it's a vanity project that shouldn't have gotten past some corner of a little-used part of the standard library. |
I merely used your example and just prepended instead of appending. I can give simpler examples, too. For example, this fails in 3.11:
a = [0.1, 0.2]
print(sum([-sum(a)] + a) == 0)
I had actually tried to add that to my earlier comment but got an error message trying to edit it. Also can't add new comments with my other account. Did you block me?
Where? I don't see it. Looks like you made that up. |
It says the rectorate doesn't allow him to link his personal web page, and at the bottom that he moved it for political reasons. I'm not interested in arguing about the 1974 German paper, just in making sure that low-level code changes of very dubious value that break existing code are not inserted without proper review. And this one should be reverted ASAP. Again, I stress that the examples I brought are code that worked on CPython before 3.12 and is broken on 3.12. If you want to give me examples of code broken on 3.11, I don't have time to hear that. |
The value isn't dubious. NumPy has been using a high accuracy In general, having more accumulated rounding error is never desirable except in the case of trying to reproduce a less accurate result. If the latter is needed, it is trivially easy to accumulate a total in a loop or to use a functional variant like FWIW, the German paper can be viewed in other languages using Google Translate. The underlying logic is now called Fast2Sum and has been well studied. It is the foundation for many high accuracy algorithms. |
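The comment above contrasts plain accumulation with compensated summation. The sketch below is an illustration only: `naive_sum` and `compensated_sum` are hypothetical helper names, the input list is a textbook-style example rather than data from this issue, and the compensated loop follows the general shape usually attributed to Neumaier's method. It is not a copy of CPython's C implementation.

```python
from functools import reduce
from operator import add


def naive_sum(values):
    """Left-to-right accumulation with no error compensation."""
    total = 0.0
    for x in values:
        total += x
    return total


def compensated_sum(values):
    """Illustrative compensated summation (hypothetical helper, not CPython's code)."""
    total = 0.0
    comp = 0.0  # running estimate of the rounding error lost so far
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            comp += (total - t) + x  # low-order bits of x were lost
        else:
            comp += (x - t) + total  # low-order bits of total were lost
        total = t
    return total + comp


data = [1.0, 1e100, 1.0, -1e100]
print(naive_sum(data))        # 0.0
print(reduce(add, data))      # 0.0
print(compensated_sum(data))  # 2.0
```

The loop and the `reduce(add, ...)` call both reproduce uncompensated left-to-right addition, which is one way to keep the older, less accurate behaviour when a test depends on it.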
Ok @rhettinger , I believe that this new implementation has lower numerical error on average, and makes exact some arithmetic expressions that were inexact before. However it is a breaking change, and so far nobody in this thread has acknowledged it (and I guess many are core devs?). I gave you examples of floating point arithmetic code that were exact on all |
For the purpose of this tracker, a 'bug' is a discrepancy between code and doc. For Issue #100425 involved 6 core developers who extensively discussed and tested the proposed change. Few issues get as much attention. As near as I can tell, they all considered the change to be an overall improvement. A few carefully picked examples will not change that opinion. In any case, 3.12 has been released. Reverting back would be a breaking change again. |
It is not a carefully picked example, but an actual bug that I encountered on an open-source library I maintain (an |
It wasn't. Consider your original example:
>>> a = [0.1, -0.2, 0.3, -0.4, 0.5]
>>> a.append(-sum(a))
The exact values stored in
And the exact sum is:
Not zero. (And please stop blocking me.) |
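One way to check the claim in the comment above is to convert each float to the exact rational it stores. This is a sketch, not the commenter's own code: it uses `reduce(add, ...)` for the negated total so that the result does not depend on which `sum` implementation is installed, and `naive_total` and `exact_total` are names introduced here for illustration.

```python
from fractions import Fraction
from functools import reduce
from operator import add

a = [0.1, -0.2, 0.3, -0.4, 0.5]

# Plain left-to-right float addition, independent of how sum() is implemented.
naive_total = reduce(add, a)
a.append(-naive_total)

# Fraction(x) converts a float to the exact rational it stores, so this
# total involves no rounding at all.
exact_total = sum(Fraction(x) for x in a)

print(reduce(add, a))      # 0.0
print(exact_total == 0)    # False
print(float(exact_total))  # 2.7755575615628914e-17
```

Under these assumptions, left-to-right addition reports 0.0 only because its rounding errors happen to cancel; the stored values themselves add up to 2**-55.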
You are really nitpicking, and it's not helping your case. If an |
In fact, we have the decimal module to demonstrate your argument (current main branch):
>>> from decimal import Decimal
>>> from functools import reduce
>>> from operator import add
>>> a = [0.1, -0.2, 0.3, -0.4, 0.5]
>>> ad = list(map(Decimal, a))
>>> Decimal(float(reduce(add, ad)))
Decimal('0.29999999999999993338661852249060757458209991455078125')
>>> float(reduce(add, ad + [-Decimal(float(reduce(add, ad)))]))
2.77555756156e-17
>>> sd = _
>>> sum(a + [-sum(a)])
2.7755575615628914e-17
>>> sr = _
>>> abs((sr - sd)/(0.0 - sd))
1.0417222640068697e-12
In fact, the new sum() is more accurate here than naive summation by several orders of magnitude.
@pochmann, you are not alone. (Is he already blocking core developers or not yet?)
@enzbus, people already told you, that using |
@kirpichevs Yes, I used |
Well, with this precision there is still a difference with sum() v3.12. But with prec=99:
>>> from decimal import getcontext
>>> getcontext().prec=99
>>> a = [0.1, -0.2, 0.3, -0.4, 0.5]
>>> ad = list(map(Decimal, a))
>>> Decimal(float(reduce(add, ad)))
Decimal('0.29999999999999993338661852249060757458209991455078125')
>>> float(reduce(add, ad + [-Decimal(float(reduce(add, ad)))]))
2.7755575615628914e-17
>>> sd = _
>>> sum(a + [-sum(a)])
2.7755575615628914e-17
>>> _ - sd
0.0 |
You are really trying to shift blame here, away from having introduced a breaking change. If you don't like my code, that's fine, but I test it across all supported CPython versions, and when I added 3.12 I had to go and find why an expression that was exact became inexact. It took me a while because I would never have thought CPython was to blame. But now I understand, I think I see a cavalier attitude among some core devs on introducing breaking changes. |
I don't think anyone is denying it's a breaking change. It clearly was. However, it's a breaking change we were comfortable making in an effort to improve Python. Every release of CPython has breaking changes, at least in the sense that the changes can be detected somehow. Which is unfortunate, but such is the cost of progress. I'm sorry you were affected by this one. But I agree with others that relying on There's no appetite here for reverting this change. |
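The examples earlier in the thread compare a float total against zero with `==`. A tolerance-based check is one common alternative that does not depend on how `sum` rounds. The sketch below reuses the random example from the thread; the tolerance `1e-12` is an illustrative choice, not a recommendation from the maintainers.

```python
import math
import random

random.seed(0)
a = [random.gauss(mu=0.0, sigma=1.0) for _ in range(10)]
residual = sum(a + [-sum(a)])

# math.isclose needs an explicit abs_tol when comparing against zero,
# because its default relative tolerance scales with the operands.
print(math.isclose(residual, 0.0, abs_tol=1e-12))  # True
```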
Could you please explain (you can still block people here, enjoy!) why you think it "was exact"? Consider a simple list of floats:
>>> a = [0.1, -0.2, 0.3, -0.4, 0.2]
Now we try to compute the sum:
>>> from random import sample
>>> from functools import reduce
>>> from operator import add
>>> reduce(add, sample(a, 5))
0.0
>>> reduce(add, sample(a, 5))
-8.326672684688674e-17
>>> reduce(add, sample(a, 5))
0.0
>>> reduce(add, sample(a, 5))
-2.7755575615628914e-17
Oops. Different results! Which answer do you prefer and why? PS:
As was pointed out at the beginning of the thread, the link was not dead at least until August 2022. Your argument is wrong. |
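The random sampling in the comment above can be made exhaustive. This sketch enumerates every ordering of the same list and collects the distinct totals that plain left-to-right addition produces, then repeats the experiment with `math.fsum`, which returns a correctly rounded result. The names `naive_totals` and `fsum_totals` are introduced here for illustration.

```python
import math
from functools import reduce
from itertools import permutations
from operator import add

a = [0.1, -0.2, 0.3, -0.4, 0.2]

# Every distinct total that plain left-to-right addition can produce.
naive_totals = {reduce(add, p) for p in permutations(a)}

# math.fsum returns the correctly rounded sum, so the order does not matter.
fsum_totals = {math.fsum(p) for p in permutations(a)}

print(sorted(naive_totals))
print(fsum_totals)
```

Left-to-right addition yields several different answers depending on the order, while the correctly rounded sum is a single value.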
Ok, apologies accepted. Thanks for making CPython available to the community. |
Let's keep it open until the PR with the fix (inaccessible link to the research) gets merged. |
We are not going to revert the change to sum. We are going to improve the reference. No more discussion needed. |
That's now done. |
Bug report
Bug description:
The new implementation of sum on Python 3.12 (cf. #100425, #100426, #107785) is not associative on simple input values. This minimal code shows the bug:
On Python 3.11:
On Python 3.12:
I'm sure this affects more users than the "improved numerical accuracy" on badly scaled input data which most users don't ever deal with, and for which exact arithmetic is already available in the Standard Library
-> https://docs.python.org/3/library/decimal.html.
I'm surprised this low-level change was accepted with so little review. There are other red flags connected with this change:
cPython's official code is dead -> cpython/Python/bltinmodule.c, line 2614 in 289af86
Is anybody interested in keeping the quality of cPython's codebase high? When I learned Python, I remember one of the first things in the official tutorial was that Python is a handy calculator, and now to me it seems broken. @gvanrossum ?
CPython versions tested on:
3.12
Operating systems tested on:
Linux