Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StreamReadCapability.EXACT_FLOATS to indicate whether parser can expose exact floating-point values or not #733

Merged
merged 6 commits into from
Jan 3, 2022

Conversation

htmldoug
Copy link
Contributor

@htmldoug htmldoug commented Dec 20, 2021

Adds StreamReadCapability.EXACT_FLOATS as suggested in #730 (comment).

@htmldoug htmldoug force-pushed the 2.14-exact-floats branch 3 times, most recently from 7e2c173 to db76643 Compare December 20, 2021 03:01
@htmldoug htmldoug changed the title Fix precision loss for JSON non-doubles. Fix precision loss for JSON non-doubles. (StreamReadCapability.EXACT_FLOATS) Dec 20, 2021
@cowtowncoder
Copy link
Member

Hmmh. I'll have to think about this some more; although the "yes, got exact value" part makes sense, the part about copying numbers as Strings is trickier (it might also make sense to just use getText() as backing parser may or may not have char[] representation but I digress).

My specific concern is whether logic makes sense for all textual backends -- YAML, for example, exposes all kinds of exotic formats -- and then re-parsing might not work by databind, nor possible cases of copying from YAMLParser into any non-YAML generator implementation.
Actually even with JSON this would break if copying from JSON to Smile, I think, since writeNumber(String) (and variants) is NOT supported there.

But one way we could also fix the JSON part, I think, might be to selectively override functionality in JsonGeneratorImpl (intermediate base type only used by JSON package, as opposed to JsonGenerator everyone uses).
Or... hmmh. Something along these lines.

Further, it probably would at least make sense to further refactor NumberType.FLOAT copy handling into separate, overridable method. This would be the first part and would also reduce duplication slightly.

Going back to the original problem of only some back-ends allowing textual number writes... there actually IS another matching JsonGenerator capability:

  • StreamWriteCapability.CAN_WRITE_FORMATTED_NUMBERS

which should help figure out if writing works.

So maybe logic would check...

  1. If parser can provide exact numbers, use original mechanism (since detection works etc)
  2. If generator DOES support CAN_WRITE_FORMATTED_NUMBERS then we could use String copy
  3. Otherwise need to decide whether to use old logic (with possible loss) or safe "use BigDecimal"?

I am not quite sure how exactly this should work, I guess -- does YAML allow exotic double presentations (for integers there's the whole menagerie of base-N variants, but for FPs?) -- but I hope above gives some more ideas.

One thing I'd be happy to do first, though, would be to do one PR only adding the new StreamReadCapability -- I think that is legit even regardless of how it is used to solve issues? So if you can separate that, I could merge that PR against 2.14 first, and you could perhaps try out cross-format cases (YAML and Smile modules, f.ex, may depend on core JSON module (because it's in jackson-core); but not the other way around).

@cowtowncoder
Copy link
Member

Oh, one other thing: @htmldoug have I asked for CLA yet? If not, that's one thing needed before I can merge a PR.
1-page doc is here:

https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf

and the usual way is to print it, fill & sign, scan/photo, email to info at fasterxml dot com; and once I receive it, store, it's good for any and all contributions for all Jackson projects.

Looking forward to getting the new capability merged!

@cowtowncoder
Copy link
Member

CLA received.

So, at this point, I'd like to get a smaller PR for the new capability, get that (those, for dataformats) merged in, and then consider how to implement actual logic.

@htmldoug
Copy link
Contributor Author

htmldoug commented Dec 29, 2021

Thanks for the prompt CLA receipt confirmation and thoughtful replies.

Descoped as requested. Would you like for me to include the failing copyCurrentStructure tests with an @Ignore in an existing suite (which?) or leave them out for now?

FWIW, the new StreamReadCapability is actually all I need to unblock me. Avoiding copyCurrentStructure entirely is a viable solution for my usage. (All of my JsonGenerator equivalents must handle stringly numbers.)

Thanks for the heads up on StreamWriteCapability.CAN_WRITE_FORMATTED_NUMBERS. I wasn't aware I needed to check that! SmileParser.writeNumber(String) looks supported to me, although it reports StreamWriteCapability.CAN_WRITE_FORMATTED_NUMBERS as false. A quick test of "1" and "1.1" shows SmileGenerator successfully writing them as small int and bigdecimal smile types respectively.

I don't see any dataformat PRs tagged to #630, so maybe flag usage just hasn't been rolled out yet. Sounds we could use an audit so we'll be able to take advantage of this someday. Are there any shared compliance tests where we could add something like: if (CAN_WRITE_FORMATTED_NUMBERS) then roundtrip "3.14" through a SmileGenerator/Parser pair? Or alternatively, is there a single place that depends on all the generators/parsers where we could test them in some common way? WDYT is the best way to build confidence in the results of getRead/WriteCapabilities() across the ecosystem?

@htmldoug htmldoug changed the title Fix precision loss for JSON non-doubles. (StreamReadCapability.EXACT_FLOATS) Add StreamReadCapability.EXACT_FLOATS to signal potential precision loss for JSON non-doubles. Dec 29, 2021
@cowtowncoder
Copy link
Member

@htmldoug New tests would be great -- I put those under "src/test/java/..../failing" instead of @Ignore.

I hope to look into this tomorrow or day after, and get this one merged; will respond to other questions too.
Just need to get through new year's stuff before OSS stuff :)

Happy New Year!

@htmldoug
Copy link
Contributor Author

Cool. I've moved the copyCurrentEvent/copyCurrentStructure test over to /failing.

To be clear, I've got a workaround in place, so I'm good for now. No rush! A fix would let me remove that workaround, but there's no urgency from my end. I'm just here to help out and give back if I can.

Happy new year!

@cowtowncoder cowtowncoder merged commit e655e46 into FasterXML:2.14 Jan 3, 2022
@cowtowncoder cowtowncoder changed the title Add StreamReadCapability.EXACT_FLOATS to signal potential precision loss for JSON non-doubles. Add StreamReadCapability.EXACT_FLOATS to indicate whether parser can expose exact floating-point values or not Jan 3, 2022
@cowtowncoder cowtowncoder added this to the 2.14.0 milestone Jan 3, 2022
@cowtowncoder
Copy link
Member

Merged.

@cowtowncoder
Copy link
Member

Defaulting for the new capability introspection now added to binary format backends (ones in jackson-dataformats-binary -- msgpack and bson would require separate additions).

cowtowncoder added a commit that referenced this pull request Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants