Decimal Support for Binary Precision #91
Comments
I made a PR attempting to add rudimentary support for Decimal fields that are represented by byte arrays, which may have precision over 18.
Problem
=======
Address #91

Solution
========
When encountering such byte array represented "Decimal" fields, parse them into raw buffers.

Change summary:
---------------
- Added code to parse "Decimal" type fields represented by byte arrays (fixed length or non-fixed length) into raw buffer values for further client side processing.
- Added two test cases verifying the added code.
- Loosen the precision check to allow values greater than 18 for byte array represented "Decimal" fields.

Steps to Verify:
----------------
- Use the library to open a parquet file which contains a "Decimal" field represented by a byte array whose precision is greater than 18.
- Before the change, library will throw an error saying precision cannot be greater than 18.
- After the change, library will parse those fields to their raw buffer values and return records normally.

---------

Co-authored-by: Wil Wade <[email protected]>
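Since the change above hands byte-array decimals back as raw buffers, the "further client side processing" could look something like the sketch below. It is written in Python purely to illustrate the arithmetic and is not part of this library. The function name `decode_decimal` is hypothetical, and it assumes the buffer holds the unscaled value as a big-endian two's-complement integer (the encoding the Parquet spec uses for byte-array decimals) and that `scale` is read from the column's schema.

```python
from decimal import Decimal


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Turn a raw DECIMAL buffer into an exact decimal.Decimal.

    Assumption: `raw` is the unscaled value as a big-endian
    two's-complement integer, and `scale` comes from the schema.
    """
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(ch) for ch in str(abs(unscaled)))
    # Building from a (sign, digits, exponent) tuple is exact; the value
    # is not rounded to the decimal context's precision, so precisions
    # well above 18 survive intact.
    return Decimal((sign, digits, -scale))


print(decode_decimal(bytes([0x30, 0x39]), 2))  # 123.45
print(decode_decimal(bytes([0xCF, 0xC7]), 2))  # -123.45
```

Going through an arbitrary-precision integer first avoids the 18-digit ceiling that a 64-bit integer imposes.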
I suspect that the earlier pull request has caused some regression issues related to
From what I can gather, this occurs even if there are no
@craxal the fix from @JasonYeMSFT released in v1.6.1 (just this morning) should fix it.
@wilwade Ah, yes, I think it does. Just tested it myself. Sorry, I thought the pull request had already been released.
Is there any status update on this item? We're hoping we can start parsing fixed length array decimals in the near future.
Currently this library only supports DECIMAL reading and writing when the precision is <= 18
To annotate the Parquet Spec: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal
- `int32`: for 1 <= precision <= 9
- `int64`: for 1 <= precision <= 18; precision < 10 will produce a warning
- `fixed_len_byte_array`: precision is limited by the array size. Length `n` can store <= `floor(log_10(2^(8*n - 1) - 1))` base-10 digits
- `binary`: `precision` is not limited, but is required. The minimum number of bytes to store the unscaled value should be used.
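To make the `fixed_len_byte_array` limit concrete, here is a small Python sketch of the formula quoted above. The function names `max_precision` and `min_bytes_for_precision` are made up for illustration and do not come from this library. It counts digits of the exact integer instead of calling a floating-point `log10`, which could round incorrectly for large `n`.

```python
def max_precision(n_bytes: int) -> int:
    """Largest decimal precision that fits in n_bytes.

    Implements floor(log_10(2^(8*n - 1) - 1)) exactly: for a positive
    integer v, floor(log10(v)) equals (number of decimal digits) - 1.
    """
    largest_signed = 2 ** (8 * n_bytes - 1) - 1
    return len(str(largest_signed)) - 1


def min_bytes_for_precision(precision: int) -> int:
    """Smallest byte length whose max precision covers `precision`."""
    n = 1
    while max_precision(n) < precision:
        n += 1
    return n


for n in (4, 8, 16):
    print(n, max_precision(n))
# 4 9
# 8 18
# 16 38
```

The 4-byte and 8-byte results match the `int32` and `int64` limits in the list, and 16 bytes gives 38 digits, which is why precisions above 18 need a byte-array representation.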
Test Files:
Related Issues: