Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse environment definitions in ToJSON and FromJSON #2107

Open
xsebek opened this issue Aug 10, 2024 · 7 comments
Open

Reuse environment definitions in ToJSON and FromJSON #2107

xsebek opened this issue Aug 10, 2024 · 7 comments
Labels
Z-Feature A new feature to be added to the game.

Comments

@xsebek
Copy link
Member

xsebek commented Aug 10, 2024

Is your feature request related to a problem? Please describe.

TLDR: the problem is that _envVars inside _envVars cause exponential JSON size.

Take this simple example:

def m2 = m; m end
def m4 = m2; m2 end
def m8 = m4; m4 end
def m16 = m8; m8 end
def m32 = m16; m16 end

And observe the growth of JSON output:

empty m2 m4 m8 m16 m32 m1024
5 27 78 180 384 792 52200

This makes it impossible to get to other useful parts of robot JSON, like the log, if the program has enough definitions:

Describe the solution you'd like

The definitions should be reused - the inner references should link ({"link": "m8"}) to outer definition.

If we can check that the definitions are the same, we could prune them from inner scope:

- deriving instance FromJSON Env
- deriving instance ToJSON Env
+ instance FromJSONE Env Env
+ instance ToJSON (Env, Env)  -- or some newtype WithEnv

Describe alternatives you've considered

The current derived instance is broken, so maybe we could only keep the top environment, or none at all.

@xsebek xsebek added the Z-Feature A new feature to be added to the game. label Aug 10, 2024
@byorgey
Copy link
Member

byorgey commented Aug 10, 2024

This may actually be a good application for the (heretofore mythical) ToJSONE. I just want to make sure we avoid doing $O(n^2)$ work to repeatedly check the same definitions for equality.

@xsebek
Copy link
Member Author

xsebek commented Aug 10, 2024

I think even $O(n^2)$ would be OK, since it would remove exponentially many inner references.

Outputting JSON is not done on main thread, so it would not freeze the game.

@byorgey
Copy link
Member

byorgey commented Aug 10, 2024

Good point. Yes, we should not succumb to premature optimization here. Let's start with the simplest thing that works and we can optimize it later if necessary.

@byorgey
Copy link
Member

byorgey commented Oct 22, 2024

After some discussion with @xsebek on Discord, here's a sketch of an idea. Env is just made up of a bunch of Ctx (which almost, but not quite always, have all the same variables). So we can focus first on making this sort of sharing/recovery work for Ctx.

Currently Ctx t is just defined as a newtype for Map Var t. The proposal is to keep this, but add some extra structure that allows us to remember the structure of how the Ctx was built, and quickly identify when two Ctx values are equal, without having to actually compare them:

data CtxStruct t = CtxEmpty | CtxSingle Var | CtxDelete Var (Ctx t) | CtxUnion (Ctx t) (Ctx t)
data Ctx t = Ctx { ctxMap :: Map Var t, ctxName :: CtxName, ctxStruct :: CtxStruct t }

The ctxMap would only be used for looking up variables efficiently, but never for serializing. The ctxName (assuming it is unique enough) can be used to quickly test Ctx values for equality. The ctxStruct explains the structure of how the Ctx was built (either empty, or as a singleton, or by deleting a variable from a Ctx, or as a union of two other Ctx) so that we can disassemble/serialize it effectively.

To generate unique ctxNames, we could either (1) hash the contents of of the context, or (2) require operations to take place in a monad that has a unique-symbol-generation effect.

I had been thinking in terms of storing each tree node indexed by name in a map, or something horrendous like that. It was @xsebek's idea that all we need to do is store some unique names alongside just so we can use them to compare for equality.

When serializing, we can keep track of a set of CtxNames we've seen so far, and essentially output a map from CtxName to one level of CtxStruct - i.e. each name maps to either a single binding, or a pair of CtxNames. To reconstruct a Ctx we read in a map from CtxName to Ctx and build the actual trees + Maps lazily as we go.

@xsebek
Copy link
Member Author

xsebek commented Oct 24, 2024

@byorgey this sounds great! 👍 Will this also work for type context, or will it use the old version? 🤔

@byorgey
Copy link
Member

byorgey commented Oct 24, 2024

Type contexts also use Ctx, so yes, it will work for those too.

@byorgey
Copy link
Member

byorgey commented Oct 26, 2024

Working on a proof of concept, watch for a PR soon... 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Z-Feature A new feature to be added to the game.
Projects
None yet
Development

No branches or pull requests

2 participants