How do we mark the difference between a string value, and an Object or Array represented as JSON #2
Comments
To be clear, I don't think this is a blocker, just something to think about. We can get around this ambiguity by providing:
Or maybe just making that the signature of |
I think I have a solution for this using unions...
Yes, I think this is likely the only way to go -- Snowflake uses a |
BTW I am not sure how mature |
BTW 2 I think @WenyXu has some other ideas here: apache/datafusion#7845 (comment)
I think unions solve this provided we can find a solution to apache/datafusion#10180.
This is solved mostly by rewriting the query.
@alamb As you'll see I've started work in #1 and pydantic/jiter#84.
But I've realised we might need some way to differentiate between nested Arrays and Objects, represented as strings, and JSON strings.
Consider the following cases:
json_get('{"foo": "bar"}', 'foo') -> 'bar'
json_get('{"foo": [1, 2, 3]}', 'foo') -> '[1, 2, 3]'
The returned values represent very different things, but unless we introduce some new type, both would be represented as strings.
Even worse:
json_get('{"foo": "[1, 2, 3]"}', 'foo') -> '[1, 2, 3]'
Here the return value exactly matches the case above, even though the JSON is different.
The main case where this becomes problematic is when you want to do:
Clearly the simplest solution is some kind of JSON marker type, but I've no idea how hard this is to define within datafusion?
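To make the ambiguity concrete, here is a minimal Python sketch. It is a toy model, not the datafusion implementation: `json_get_naive`, `json_get_tagged`, `Str` and `Json` are hypothetical names made up for illustration. The naive version returns a plain string in every case, so the nested array and the JSON string collide. The tagged version wraps the result in a marker type (roughly the union idea) so the two stay distinguishable.

```python
import json
from dataclasses import dataclass
from typing import Union


def json_get_naive(doc: str, key: str) -> str:
    """Hypothetical json_get that always returns a plain string."""
    value = json.loads(doc)[key]
    # JSON strings come back unquoted; anything else is re-serialized as JSON text.
    return value if isinstance(value, str) else json.dumps(value)


@dataclass(frozen=True)
class Str:
    """Marker for a value that was a JSON string."""
    value: str


@dataclass(frozen=True)
class Json:
    """Marker for a nested Array/Object (or other non-string), kept as JSON text."""
    text: str


def json_get_tagged(doc: str, key: str) -> Union[Str, Json]:
    """Hypothetical json_get that tags its result with a marker type."""
    value = json.loads(doc)[key]
    if isinstance(value, str):
        return Str(value)
    return Json(json.dumps(value))


nested = '{"foo": [1, 2, 3]}'    # value is an Array
quoted = '{"foo": "[1, 2, 3]"}'  # value is a string that looks like an Array

# Naive: both inputs produce the same plain string.
print(json_get_naive(nested, "foo") == json_get_naive(quoted, "foo"))

# Tagged: the marker type keeps the two cases apart.
print(json_get_tagged(nested, "foo"))
print(json_get_tagged(quoted, "foo"))
```

With a marker like this, a follow-up call can decide whether its input should be parsed as JSON again or treated as an opaque string, which is the distinction the plain-string return value loses.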