Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for large decimals with >34 digits. #293

Merged
merged 3 commits into from
Oct 19, 2023

Conversation

nirosys
Copy link
Contributor

@nirosys nirosys commented Oct 18, 2023

Issue #, if available: #159

Description of changes:
This PR fixes the issue identified in #159 where no more than 34 digits were supported in ion decimals.

There were 2 issues in this PR that should be made aware for anyone working with ion-c and decNumber.

The Decimal Context Needs to Allow More Digits

Originally, ion-python's C extension initialized the decContext used for configuring the decNumber library with one of the available presets, DEC_INIT_DECQUAD. This preset configures the library with the expectation that you'll be using decQuads, which are limited to 128bit, or 34 decimal digits. This configuration lead to ion-c returning IERR_NUMERIC_OVERFLOW when trying to load a larger than 34 digit decimal:

>> import amazon.ion.simpleion as ion
>>> ion.loads("99999999999999999999999999999999999999999999999999999999999999999999999999.999999999999999999999999")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/ion-python/amazon/ion/simpleion.py", line 487, in loads
    return load(ion_buffer, catalog=catalog, single_value=single_value, encoding=encoding, cls=cls,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ion-python/amazon/ion/simpleion.py", line 556, in load
    return load_extension(fp, parse_eagerly=parse_eagerly, single_value=single_value,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ion-python/amazon/ion/simpleion.py", line 519, in load_extension
    value = next(iterator)
            ^^^^^^^^^^^^^^
amazon.ion.exceptions.IonException: IERR_NUMERIC_OVERFLOW 
>>> 

When creating the decContext, we can use the digits field to specify how many digits we would like to support before returning overflow errors, or when rounding is required. In this PR I've set that field to 10,000.

decNumbers do not have a well defined size

The definition of decNumber defines the lsu field like this:

typedef struct {
   // .. other stuff not important atm.
   decNumberUnit lsu[DECNUMUNITS];
} decNumber;

DECNUMUNITS is the number of units required to store DECNUMDIGITS digits. We define DECNUMDIGITS at build time, so this struct has a well defined size. The compiler recognizes the size, and expects that every decNumber will be this size. Unfortunately for the compiler, and us, this is not true.

The decNumber library, despite defining a known size, does not consider the size of lsu to be static. If more memory is needed to hold more digits, a larger buffer can be allocated for the decNumber and associated operations know to go beyond DECNUMUNITS units to access more digits.

Because of this an operation as innocent as:

read_number = *(decimal_value.value.num_value);

which the compiler treats as a memcpy, using the struct's size, has the potential to stop short of copying all of the digit data. This results in the decNumber library poke around memory that is beyond the size of the struct:

>>> import amazon.ion.simpleion as ion
>>> ion.loads("99999999999999999999999999999999999999999999999999999999999999999999999999.999999999999999999999999")
Decimal('936521992000000000000000764479192000559583245000000999999999999999.999999999999999999999999')
>>> 

This PR removes the copy, and just refers to the decNumber through a pointer. In the case of the decimal being a decQuad, a new decNumber is allocated to receive the converted decQuad. The memory is allocated with the default size of the decNumber since the maximum size of a decQuad is 34 digits, which is what we define for DECNUMDIGITS.

With this change, we no longer get garbage digits beyond the default struct size:

>>> import amazon.ion.simpleion as ion
>>> ion.loads("99999999999999999999999999999999999999999999999999999999999999999999999999.999999999999999999999999")
Decimal('99999999999999999999999999999999999999999999999999999999999999999999999999.999999999999999999999999')
>>> 

An important callout here is that when dealing with decNumbers, unless we know for sure that the value will not exceed the default number of digits, we should be allocating buffers for them on the heap, and not trying to work with stack allocated structs.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

In order to support 3rd party repos it looks like we can either leave both the
revision, and the repo off of the config, or specify both. By not specifying
the repo, but specifying the ref/revision, the checkout action will try to pull
the revision from the target repo.
@nirosys
Copy link
Contributor Author

nirosys commented Oct 19, 2023

Updated the performance regression workflow to update the checkout of PR's source to include the repo. The repo defaults to the target repo if you fill in a rev/ref, which ends up breaking if the PR is made from a fork.

After a successful run, 3 of the 24 perf regression tests failed, with wild numbers (+30-40%), where the passing tests do not reach anywhere near as high. I re-ran the failed tests expecting the spike to be environmental, and 2 of the 3 failed tests succeeded this time showing 1-2% differences vs the original 30-40%, which seems to confirm it is environmental. Going to merge.

@nirosys nirosys marked this pull request as ready for review October 19, 2023 22:49
@nirosys nirosys merged commit fcffdef into amazon-ion:master Oct 19, 2023
32 of 33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants