Reuse environment definitions in `ToJSON` and `FromJSON` #2107

xsebek · 2024-08-10T14:02:17Z

Is your feature request related to a problem? Please describe.

TL;DR: `_envVars` nested inside `_envVars` make the JSON output grow exponentially.

Take this simple example:

def m2 = m; m end
def m4 = m2; m2 end
def m8 = m4; m4 end
def m16 = m8; m8 end
def m32 = m16; m16 end

And observe the growth of JSON output:

empty	m2	m4	m8	m16	m32	m1024
5	27	78	180	384	792	52200

This makes it impossible to get to other useful parts of robot JSON, like the log, if the program has enough definitions:

Web API for base robot outputs 800MB per minute #2106
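A toy model may help show where the blow-up comes from. This is only a sketch, not the real `Env` type: `Def`, `chain`, `inlinedSize`, and `sharedSize` are made-up names. Each definition refers to the previous one twice, so expanding every reference inline roughly doubles the output per added definition, while emitting each distinct definition once grows linearly.

```haskell
-- Hypothetical model: a leaf command, or a definition whose body
-- mentions one earlier definition twice (like `def m4 = m2; m2 end`).
data Def = Leaf | Twice Def

-- The chain m, m2, m4, ... after n doublings.
chain :: Int -> Def
chain n = iterate Twice Leaf !! n

-- Nodes written if every reference is expanded inline.
inlinedSize :: Def -> Integer
inlinedSize Leaf = 1
inlinedSize (Twice d) = 1 + 2 * inlinedSize d

-- Nodes written if each distinct definition is written once
-- and later mentions are replaced by a link to it.
sharedSize :: Def -> Integer
sharedSize Leaf = 1
sharedSize (Twice d) = 1 + sharedSize d
```

For the five definitions in the example above, `inlinedSize (chain 5)` is 63 while `sharedSize (chain 5)` is 6.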

Describe the solution you'd like

The definitions should be reused: the inner references should link ({"link": "m8"}) to the outer definition.

If we can check that the definitions are the same, we could prune them from inner scope:

- deriving instance FromJSON Env
- deriving instance ToJSON Env
+ instance FromJSONE Env Env
+ instance ToJSON (Env, Env)  -- or some newtype WithEnv

Describe alternatives you've considered

The current derived instance is broken, so maybe we could only keep the top environment, or none at all.


byorgey · 2024-08-10T14:28:21Z

This may actually be a good application for the (heretofore mythical) ToJSONE. I just want to make sure we avoid doing $O(n^2)$ work to repeatedly check the same definitions for equality.

xsebek · 2024-08-10T14:37:30Z

I think even $O(n^2)$ would be OK, since it would remove exponentially many inner references.

Outputting JSON is not done on main thread, so it would not freeze the game.

byorgey · 2024-08-10T14:50:59Z

Good point. Yes, we should not succumb to premature optimization here. Let's start with the simplest thing that works and we can optimize it later if necessary.

byorgey · 2024-10-22T01:42:05Z

After some discussion with @xsebek on Discord, here's a sketch of an idea. Env is just made up of a bunch of Ctx (which almost, but not quite always, have all the same variables). So we can focus first on making this sort of sharing/recovery work for Ctx.

Currently Ctx t is just defined as a newtype for Map Var t. The proposal is to keep this, but add some extra structure that allows us to remember the structure of how the Ctx was built, and quickly identify when two Ctx values are equal, without having to actually compare them:

data CtxStruct t = CtxEmpty | CtxSingle Var | CtxDelete Var (Ctx t) | CtxUnion (Ctx t) (Ctx t)
data Ctx t = Ctx { ctxMap :: Map Var t, ctxName :: CtxName, ctxStruct :: CtxStruct t }

The ctxMap would only be used for looking up variables efficiently, but never for serializing. The ctxName (assuming it is unique enough) can be used to quickly test Ctx values for equality. The ctxStruct explains the structure of how the Ctx was built (either empty, or as a singleton, or by deleting a variable from a Ctx, or as a union of two other Ctx) so that we can disassemble/serialize it effectively.
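A minimal sketch of how the three fields could be kept in sync, under several assumptions: `CtxName` is a placeholder `Int` supplied by the caller, `CtxSingle` also stores the bound value (so the structure alone is enough to serialize), union lets the second context shadow the first, and the constructor names `emptyCtx`, `singletonCtx`, `deleteCtx`, `unionCtx`, and `lookupCtx` are invented for illustration.

```haskell
import Data.Map (Map)
import qualified Data.Map as M

type Var = String
type CtxName = Int -- placeholder; how names are generated is left open here

data CtxStruct t
  = CtxEmpty
  | CtxSingle Var t -- assumption: the value is stored too
  | CtxDelete Var (Ctx t)
  | CtxUnion (Ctx t) (Ctx t)

data Ctx t = Ctx {ctxMap :: Map Var t, ctxName :: CtxName, ctxStruct :: CtxStruct t}

-- Equality only looks at names, never at the maps.
instance Eq (Ctx t) where
  a == b = ctxName a == ctxName b

emptyCtx :: CtxName -> Ctx t
emptyCtx n = Ctx M.empty n CtxEmpty

singletonCtx :: CtxName -> Var -> t -> Ctx t
singletonCtx n x v = Ctx (M.singleton x v) n (CtxSingle x v)

deleteCtx :: CtxName -> Var -> Ctx t -> Ctx t
deleteCtx n x c = Ctx (M.delete x (ctxMap c)) n (CtxDelete x c)

-- Assumption: right-biased, so bindings in the second context win.
unionCtx :: CtxName -> Ctx t -> Ctx t -> Ctx t
unionCtx n a b = Ctx (M.union (ctxMap b) (ctxMap a)) n (CtxUnion a b)

-- Lookups go through the map only.
lookupCtx :: Var -> Ctx t -> Maybe t
lookupCtx x = M.lookup x . ctxMap
```

Each operation updates the map for fast lookup and records one more layer of structure, so the map never has to be inspected when comparing or serializing.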

To generate unique ctxNames, we could either (1) hash the contents of the context, or (2) require operations to take place in a monad that has a unique-symbol-generation effect.
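Option (1) could look roughly like the sketch below, where the name of a context is computed from the names of its parts rather than by rehashing the whole map. Everything here is an assumption made for illustration: the `CtxName` wrapper around `Word64`, the helper names, the use of `Show` to hash values, and the FNV-1a-style hash over code points (chosen only because it needs nothing beyond `base`).

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)

type Var = String

newtype CtxName = CtxName Word64
  deriving (Eq, Ord, Show)

-- FNV-1a-style hash over the code points of a string (64-bit constants).
fnv1a :: String -> Word64
fnv1a = foldl' step 14695981039346656037
 where
  step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

emptyName :: CtxName
emptyName = CtxName (fnv1a "empty")

-- Name of a singleton context: hash of the variable and the shown value.
singleName :: Show t => Var -> t -> CtxName
singleName x v = CtxName (fnv1a ("single:" ++ show x ++ "=" ++ show v))

-- Names of compound contexts are derived from the child names only,
-- so naming a union or deletion costs the same however big the children are.
deleteName :: Var -> CtxName -> CtxName
deleteName x (CtxName h) = CtxName (fnv1a ("delete:" ++ show x ++ ":" ++ show h))

unionName :: CtxName -> CtxName -> CtxName
unionName (CtxName a) (CtxName b) = CtxName (fnv1a ("union:" ++ show a ++ "," ++ show b))
```

With this scheme, equal names imply equal contexts only up to hash collisions, and two contexts with the same bindings but a different construction history get different names. That is acceptable for sharing, since a missed match only costs some duplication.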

I had been thinking in terms of storing each tree node indexed by name in a map, or something horrendous like that. It was @xsebek's idea that all we need to do is store some unique names alongside just so we can use them to compare for equality.

When serializing, we can keep track of a set of CtxNames we've seen so far, and essentially output a map from CtxName to one level of CtxStruct - i.e. each name maps to either a single binding, or a pair of CtxNames. To reconstruct a Ctx we read in a map from CtxName to Ctx and build the actual trees + Maps lazily as we go.
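The serialization idea can be sketched as follows, leaving out the actual JSON encoding. The assumptions: `CtxName` is a plain `Int`, `CtxSingle` carries its value, union is right-biased, and `Node`, `flatten`, and `rebuild` are hypothetical names. `flatten` emits each distinct name once, with the accumulated map doubling as the set of names seen so far. `rebuild` ties the knot through a lazy `Data.Map`, so each context is reconstructed once and shared.

```haskell
import Data.Map (Map)
import qualified Data.Map as M

type Var = String
type CtxName = Int -- placeholder name type

data CtxStruct t = CtxEmpty | CtxSingle Var t | CtxDelete Var (Ctx t) | CtxUnion (Ctx t) (Ctx t)
data Ctx t = Ctx {ctxMap :: Map Var t, ctxName :: CtxName, ctxStruct :: CtxStruct t}

-- One level of structure, with child contexts replaced by their names.
data Node t = NEmpty | NSingle Var t | NDelete Var CtxName | NUnion CtxName CtxName
  deriving (Eq, Show)

-- Add a context and everything it refers to, skipping names already present.
flatten :: Ctx t -> Map CtxName (Node t) -> Map CtxName (Node t)
flatten c seen
  | ctxName c `M.member` seen = seen
  | otherwise = case ctxStruct c of
      CtxEmpty -> M.insert n NEmpty seen
      CtxSingle x v -> M.insert n (NSingle x v) seen
      CtxDelete x d -> M.insert n (NDelete x (ctxName d)) (flatten d seen)
      CtxUnion a b -> M.insert n (NUnion (ctxName a) (ctxName b)) (flatten b (flatten a seen))
 where
  n = ctxName c

-- Rebuild all contexts from the flat map. Values of `ctxs` are lazy thunks
-- that look up their children in `ctxs` itself; a dangling name is an error.
rebuild :: Map CtxName (Node t) -> Map CtxName (Ctx t)
rebuild nodes = ctxs
 where
  ctxs = M.mapWithKey build nodes
  build n node = case node of
    NEmpty -> Ctx M.empty n CtxEmpty
    NSingle x v -> Ctx (M.singleton x v) n (CtxSingle x v)
    NDelete x d -> let c = ctxs M.! d in Ctx (M.delete x (ctxMap c)) n (CtxDelete x c)
    NUnion a b ->
      let l = ctxs M.! a
          r = ctxs M.! b
       in Ctx (M.union (ctxMap r) (ctxMap l)) n (CtxUnion l r)
```

For a chain like the `m2`, `m4`, ... example, where each context is a union of the previous one with itself, `flatten` produces one entry per distinct context instead of one per path through the tree.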

xsebek · 2024-10-24T20:29:48Z

@byorgey this sounds great! 👍 Will this also work for type context, or will it use the old version? 🤔

byorgey · 2024-10-24T21:39:46Z

Type contexts also use Ctx, so yes, it will work for those too.

byorgey · 2024-10-26T21:59:00Z

Working on a proof of concept, watch for a PR soon... 😃

xsebek added the Z-Feature label ("A new feature to be added to the game.") on Aug 10, 2024

byorgey added a commit that referenced this issue Oct 26, 2024

[wip] Towards #2107; add structure + homomorphic hashes to Ctx

d8d4f08

byorgey mentioned this issue Nov 5, 2024

Contexts that can be serialized + deserialized while retaining and explicitly representing sharing #2202

Draft
