-
Notifications
You must be signed in to change notification settings - Fork 51
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add trailing_commas
to dump
/dumps
#372
Conversation
Add an optional boolean argument `trailing_commas` to the `dump` and `dumps` methods in `simpleion`. Default: False. When `trailing_commas=True`, and when `indent` is not `None`, this causes `dump[s]` to include a comma on the last item in a container. **Note:** Since the C module already **does not support pretty printing**, this flag does nothing when using the C module.
Make the `trailing_commas` flag do what it says: Only apply when pretty printing (`indent is not None`). Fix all remaining `test_simpleion.py` failures.
@@ -121,11 +121,15 @@ def dump(obj, fp, imports=None, binary=True, sequence_as_stream=False, indent=No | |||
indent (Str): If binary is False and indent is a string, then members of containers will be pretty-printed with | |||
a newline followed by that string repeated for each level of nesting. None (the default) selects the most | |||
compact representation without any newlines. Example: to indent with four spaces per level of nesting, | |||
use ``' '``. | |||
use ``' '``. Supported only in the pure Python implementation (because pretty printing is not yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
np!
@@ -369,7 +369,7 @@ def _raw_writer_coroutine(depth=0, container_event=None, whence=None, indent=Non | |||
ion_event, self = (yield transition) | |||
delegate = self | |||
|
|||
if has_written_values and not ion_event.event_type.ends_container: | |||
if has_written_values and ((indent and trailing_commas) or not ion_event.event_type.ends_container): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be wrong but I think this will also add a space after the last item in an s-exp: (foo bar bar )
I'm not sure that's bad but I don't think it's intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I did think you were right, but now I actually think not, because there's a separate set of delimiters defined for use when pretty printing:
_CONTAINER_DELIMITER_MAP_PRETTY = {
IonType.STRUCT: b',',
IonType.LIST: b',',
IonType.SEXP: b'', # we use newlines when pretty printing
}
which uses an empty string as the delimiter for s-expressions, since they add newlines.
I could test this to make sure, but I'm pretty sure.
If we do want to apply this to non-pretty-printing, then we could do that, but then your concern expressed here would need an additional work-around, or there would indeed be an extra space.
I think the solution, if needed, would be to fork:
_CONTAINER_DELIMITER_MAP_NORMAL = {
IonType.STRUCT: b',',
IonType.LIST: b',',
IonType.SEXP: b' ',
}
into:
_CONTAINER_DELIMITER_MAP_NORMAL = {
IonType.STRUCT: b',',
IonType.LIST: b',',
IonType.SEXP: b' ',
}
_CONTAINER_END_DELIMITER_MAP_NORMAL = {
IonType.STRUCT: b',',
IonType.LIST: b',',
IonType.SEXP: b'',
}
and use _CONTAINER_END_DELIMITER_MAP_NORMAL
for the final trailing delimiter.
tuple_as_sexp (Optional[True|False]): When True, all tuple values will be written as Ion s-expressions. | ||
When False, all tuple values will be written as Ion lists. Default: False. | ||
omit_version_marker (Optional[True|False]): If binary is False and omit_version_marker is True, omits the | ||
Ion Version Marker ($ion_1_0) from the output. Default: False. | ||
trailing_commas (Optional[True|False]): If binary is False and pretty printing (indent is not None), includes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: it may actually be clearer to a user to make it independent of indent
I know you don't intend to use it that way, and TBH I don't know the value of doing so, but it would be avoid the need to explain and users understand the dependent relationship, or test for it in the code. 🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a thought.
I actually had it accidentally including trailing commas in non-pretty-printed output for the first revision, so it would be easy to put back to that (except for the problem explained in the next paragraph). I guess if you say you don't want indent but you do want trailing commas, there's no strong reason not to.
However ... there is one reason not to. This is a reason why I'd rather that be a feature we could consider but not try to include here. That reason is that the Python C module implementation already doesn't support pretty-printing, so I was able to leave it out of scope. If we make trailing_commas
applicable to non-pretty-printed output, then I have to go and do all the C module implementation.
I'd rather not hold this PR up to implement non-pretty-printed trailing-comma output in the C module; especially since I'm not sure we're likely to see any strong demand for trailing commas in non-pretty-printed output. I guess we could make it not depend on indent
, but still disclaim it as "not supported in the C implementation". So that's an option.
Ouch, that's a lot of failed checks. :/ I guess I'll have to have a look into those at some point. So I don't go on any wild goose chases, do we normally expect all of those checks to pass? [UPDATE: Looking more, most of them are segfaults and stuff that don't look in any way potentially related to this PR, so I suspect this is (sadly) the status quo, and not something caused by this code change.] |
amazon/ion/simpleion.py
Outdated
"""C-extension implementation. Users should prefer to call ``dump``.""" | ||
|
||
res = ionc.ionc_write(obj, binary, sequence_as_stream, tuple_as_sexp) | ||
res = ionc.ionc_write(obj, binary, sequence_as_stream, tuple_as_sexp, trailing_commas) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This call is what is causing the segfault in the unit tests. ionc_write
isn't expecting to get the trailing_commas argument, and is not handling it well.
If you want to put in a # TODO:
, and drop the trailing_commas
from the ionc.ionc_write
call, I can dig into why it isn't handling that a bit more gracefully, and that should clear up the test errors (the ones that aren't already explained).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops! I meant to omit the argument from the call to ionc...
I'll fix that and post a new rev.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, I must be confused. I'm not passing the arg to the C code, as far as I can see?
(Note lack of trailing_commas=trailing_commas
in the first branch.)
if c_ext and __IS_C_EXTENSION_SUPPORTED and (imports is None and indent is None):
return dump_extension(obj, fp, binary=binary, sequence_as_stream=sequence_as_stream,
tuple_as_sexp=tuple_as_sexp, omit_version_marker=omit_version_marker)
else:
return dump_python(obj, fp, imports=imports, binary=binary, sequence_as_stream=sequence_as_stream,
indent=indent,
tuple_as_sexp=tuple_as_sexp, omit_version_marker=omit_version_marker,
trailing_commas=trailing_commas)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, oops, in dump_extension
. D'oh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
(Removed trailing_commas
from dump_extension()
and added a TODO
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
Issue #, if available:
Issue # 370
Description of changes:
Add an optional boolean argument
trailing_commas
to thedump
anddumps
methods insimpleion
. Default: False.When
trailing_commas=True
, and whenindent
is notNone
, this causesdump[s]
to include a comma on the last item in a container.Note: Since the C module already does not support pretty printing, this flag does nothing when using the C module.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.