
Array streaming and writer API element tracking #112

@arnetheduck


In the current version of the API, arrays are created using stepwiseArrayCreation, which requires a collection. Plumbing such as commas and indentation is handled by the iterator, but there is no way to create an array and have the writer manage the plumbing while elements are written one by one.

For objects, a somewhat haphazard combination of mechanisms makes the plumbing work: writeName injects the comma, while "field end" tracking is done inconsistently. writeField is a special version of writeValue that performs it in some cases, but it cannot be used for nested objects and arrays. Hence the library contains the additional helpers fieldWritten and endRecordField, which patch up the state after a field has been written by means other than writeField.

Objects can be written in two ways: either by passing a full object to writeValue or by using beginRecord and endRecord/endRecordField. The latter method allows streaming fields one by one without an instance and is used in writeValue overloads to provide custom formatting for types. Depending on whether you're writing a "top-level" record or nested, you would use endRecord and endRecordField respectively.

To allow streaming array elements without first constructing a collection, we need the equivalent of beginRecord and endRecord, but for arrays. Objects have a natural place to perform pre-field plumbing work - writeName - but for arrays there is no such requirement. Objects and arrays alike have no natural home for post-element plumbing.
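As a rough illustration of what an array counterpart to beginRecord and endRecord could look like, here is a small language-neutral model in Python. It is not the library's Nim API: ArraySketchWriter, begin_array, write_element and end_array are hypothetical names, and the only point is that the writer, not the caller, remembers whether a comma is due.

```python
import io
import json


class ArraySketchWriter:
    """Hypothetical model: streams array elements without a collection."""

    def __init__(self, stream):
        self.stream = stream
        # One flag per open array: has an element been written yet?
        self._has_element = []

    def begin_array(self):
        self.stream.write("[")
        self._has_element.append(False)

    def write_element(self, value):
        # Pre-element plumbing: a comma before every element but the first.
        if self._has_element[-1]:
            self.stream.write(",")
        self._has_element[-1] = True
        self.stream.write(json.dumps(value))

    def end_array(self):
        self._has_element.pop()
        self.stream.write("]")


def stream_squares(n):
    out = io.StringIO()
    w = ArraySketchWriter(out)
    w.begin_array()
    for i in range(n):  # elements are produced one by one, no list is built
        w.write_element(i * i)
    w.end_array()
    return out.getvalue()
```

Here write_element plays the role that writeName plays for objects: it is the one place where pre-element plumbing can happen. This model does not yet cover arrays nested inside objects or other arrays, which is the gap the approaches below try to close.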

To resolve this issue, there are two main approaches:

  • introduce a rich set of plumbing functions - beginArray for top-level arrays, beginArrayElement for starting an array inside an array, beginArrayMember for starting an array inside an object - and likewise for values: in addition to writeField (which writes values inside an object), we add writeElement, which writes values inside arrays. The user then has to make sure that they call the right function for the context they are in - top-level, inside an array or inside an object.
  • make writeValue responsible for the plumbing - this makes it significantly easier to track plumbing since every value written can perform the "begin" and "end" work. Writing nested arrays and objects becomes trivial as the framework keeps track of the nesting, and value-writing naturally provides the writer with enough information to inject commas and indentation. This is, however, a significant departure from the current API: custom writeValue overrides currently have no such begin/end requirement, and "forgetting" to add begin/end would result in invalid JSON being output, meaning that this change, done naively, would break existing writers.
    • The breakage can be limited by forcing custom writers to go through functions controlled by the framework when writing to the stream - loosely, writers delegate writing to the framework using the framework-provided writeValue (i.e. flatten a custom object type to a JSON string instead of expanding it into fields) - this would provide the necessary control points to inject "framework stuff". However, direct access to the stream is currently used to create efficient "no-allocation" style serializers which exploit common stream functionality instead of inventing their own. Stream access is a valuable feature!
    • An option is to add a "stream accessor" that does the plumbing and then permits direct stream access.
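
The second approach, including the "stream accessor" idea, can be sketched with the same kind of Python model. Everything here is hypothetical (ValueSketchWriter, write_value, write_name, array, record, raw_value) and only meant to show how a nesting stack lets every value perform its own "begin" work, including values written directly to the stream.

```python
import io
import json
from contextlib import contextmanager


class ValueSketchWriter:
    """Hypothetical model: every value passes through framework plumbing."""

    def __init__(self, stream):
        self._stream = stream
        # One entry per open array/object: True once something was written.
        self._written = []
        self._after_name = False  # a field name was just written

    def _begin_value(self):
        # A value that follows a field name needs no comma of its own.
        if self._after_name:
            self._after_name = False
        elif self._written:
            if self._written[-1]:
                self._stream.write(",")
            self._written[-1] = True

    def write_value(self, value):
        self._begin_value()
        self._stream.write(json.dumps(value))

    def write_name(self, name):
        if self._written[-1]:
            self._stream.write(",")
        self._written[-1] = True
        self._stream.write(json.dumps(name) + ":")
        self._after_name = True

    @contextmanager
    def _container(self, opener, closer):
        self._begin_value()
        self._stream.write(opener)
        self._written.append(False)
        yield
        self._written.pop()
        self._stream.write(closer)

    def array(self):
        return self._container("[", "]")

    def record(self):
        return self._container("{", "}")

    @contextmanager
    def raw_value(self):
        # "Stream accessor": do the plumbing, then hand out the raw stream.
        self._begin_value()
        yield self._stream


def demo():
    out = io.StringIO()
    w = ValueSketchWriter(out)
    with w.record():
        w.write_name("id")
        w.write_value(7)
        w.write_name("tags")
        with w.array():
            w.write_value("a")
            with w.array():  # nested array, no special function needed
                w.write_value(1)
                w.write_value(2)
            with w.raw_value() as s:  # custom serializer writes directly
                s.write('"0x')
                s.write('ff"')
    return out.getvalue()
```

In this model the caller never chooses between top-level, in-array and in-object variants: the stack supplies the context. The raw_value accessor keeps direct stream access for "no-allocation" style serializers while still giving the framework its control point before the bytes are written.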
