Skip to content
This repository has been archived by the owner on Jan 22, 2019. It is now read-only.

Per [#106] added allowNullFields to provide the option of processing… #107

Closed

Conversation

morbrian
Copy link

@morbrian morbrian commented Jan 1, 2016

Fix #106 null elements of lists as empty fields or the nullCharacter during serialization instead of the default behavior which skips null elements.

…ocessing null elements of lists as empty fields or the nullCharacter during serialization instead of the default behavior which skips null elements.
@cowtowncoder
Copy link
Member

Thank you for the improvement here -- I will have to spend bit more time to fully understand it, but I think it looks like a good thing to add.

In the meantime: if I have not yet asked for the Contributor License Agreement (CLA), from:

https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf

I would need one before merging the first contribution; one is enough for all future contributions.
The usual way is to print it, fill, sign, scan and email to info at fasterxml dot com; once I get it I can merge the PR. It's the only bit of paperwork we need fortunately.

@morbrian
Copy link
Author

morbrian commented Jan 2, 2016

Thank you for looking at it. The paperwork is no problem.

@cowtowncoder
Copy link
Member

One question here: it seems like there are two different cases; "root nulls", where the main-level value is null; and "field nulls", where object property is null. Well, actually I guess there could be third one for nulls within array values (within field), but I'll ignore that for now.

Should handling between these differ? It seems like the second case -- null as property value -- should be enabled by default (unless I misunderstand the effects) or at least vary independently from the question of whether to ignore root-level nulls?

If so, it would seem that perhaps field-null handling should relate to schema (as it is per-type), whereas root-null handling should perhaps be a CsvGenerator.Feature?

I am just thinking out aloud here and it is possible I miss something, or misunderstood things, so feel free to correct me if so. :)

I'll keep on reading code and especially test case to have bit more understanding. And thanks for the CLA, received it.

@morbrian
Copy link
Author

morbrian commented Jan 2, 2016

Those are good points. I hope this response isn’t too long-winded, but I am rethinking allowNullFields now and wanted to provide a good explanation.

Your breakdown of the problem into two different cases provided helpful insight, and I am thinking hard about this point in particular regarding the second case of field nulls:

It seems like the second case -- null as property value -- should be enabled by default

Here is a quick code snippet to help clarify the discussion:

List recordList = Arrays.asList(rowItem1, rowItem2, rowItem3);

I am thinking of rowItem objects interchangeably with main-level objects. I am also thinking of fields as any field or element of an object that becomes a field of the CSV output.

Below I describe two sub-cases of the case you named field-nulls.

Sub-Case 1: Field Nulls on rowItem POJO

When the rowItems are POJOs with properties, null handling of fields was already working correctly, before my proposed modification. In fact, the proposed allowNullFields setting currently has no effect on POJOs with properties.

Sub-Case 2: Field Nulls on rowItem Lists

When the rowItems are Lists, then the items of theses lists become the fields of the row upon serialization. The current default behavior skips over null items of a list. For use-cases where the fields are equivalent to columns, then skipping over a null during serialization is undesirable behavior. This motivated my proposed allowNullFields.

Consistent Default Behavior

I suppose right now the behavior when serializing a field-null is inconsistent because POJO property null values result in empty fields, while List element null values are skipped.

To be consistent, the List element null values should not be skipped by default for elements of rowItem lists.

Main-Level Null Case

Referring back to the recordList code snippet, I agree null main-level rowItems should be skipped by default, and I do not think that behavior needs to change.

Unfortunately, my proposed allowNullFields changes that behavior when enabled, and I think that is probably bad.

Achieving the desired behavior can be ambiguous in some cases. For example, when a writer is asked to serialize a list, I do not think it knows whether it should treat the list as a recordList (i.e. collection of rowItems) or whether it should treat the list as a single rowItem.

Next Steps

I should probably close this PR and rework the code so that:

  1. By default, serializing a rowItem does not skip writing null fields or elements when rowItem is a list.
  2. The proposed default behavior of allowing nulls when serializing rowItem lists should not effect how nulls at the main-level record-list are handled.

If you generally agree, I will go ahead and close this PR and reopen after I figure out a solution that meets the above goals.

@morbrian
Copy link
Author

morbrian commented Jan 3, 2016

I revisited the code with our conversation in mind and found a simpler solution that only impacts the CsvGenerator. I will close this PR and open a new one with just that commit.

@morbrian morbrian closed this Jan 3, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants