Runner state setting and getting #2

mapto · 2022-01-10T16:02:35Z

I'm putting forward a first suggestion of state manipulation that would allow persisting the story state externally, as discussed in Issue #1

I don't expect this solution to be accepted as it is, it's rather a discussion of the approach. I preferred constructor and repr() interface, because another save/load functions would have to export in/import from a complex data structure that would have to be represented in a format (new class or JSON, etc) that is yet to be introduced. The proposed approach is probably more pythonic, relying on built-in functions. Of course a more efficient way to export is to access the field values, rather than the repr() string. Yet, repr() enables the reflection pattern.

Also, my interpretation of state is rather naive:

I need use cases to better decide whether I should be including line_buffer, option_buffer, vm_data_stack, vm_instruction_stack and program_counter.
Command handlers are also not really persisted, as commands themselves are external to the runner

Let me know what your thoughts are and I will adapt the solution.

weaversam8 · 2022-01-11T03:39:49Z

Thanks so much for the PR! I want to warn you that it will take me longer than normal (probably about two weeks) before I can review and comment on this, but I'm excited to do so once I get the chance!

mapto · 2022-01-15T21:24:07Z

I'm having problems serializing and deserializing the compiled_yarn or the vm_instruction_stack. Which one makes sense to persist? Any examples (I am trying with protobuf's SerializeToString/ParseFromString, but doesn't work)?Any hints you could give me?

mapto · 2022-01-15T21:31:22Z

Sorry, just saw this:
https://github.com/relaypro-open/YarnRunner-Python/blob/main/yarnrunner_python/runner.py#L113
Should do the job for me.

mapto · 2022-01-19T14:11:23Z

I've expanded my state export with other attributes, including merging your changes. However, there's still something wrong with the state (see (this test)[https://github.com/mapto/YarnRunner-Python/blob/main/tests/test_shortcuts.py#L63] that fails on line 92). It seems that after loading the lines are not picking up from where they were left. Instead they restart. Could you give me a hand identifying what's the storage that I'm missing? Thanks!

PS: I will now change the persistence to JSON, as it seems to be the format that's already partially used. However, it would be pointless if I am not capturing the complete state.

weaversam8 · 2022-01-24T20:16:28Z

Thanks for working on a first pass at this @mapto! I really appreciate you putting this forward, and look forward to working through it with you to get it merged.

A couple of things:

Let's definitely not use repr()/eval() to save state- as nice as it is I don't like the idea of opening a hole for someone to try to sneak arbitrary code execution into save state files. JSON as you mentioned should be fine.
I don't think we necessarily need to use a constructor approach where we include every piece of internal data as a parameter. I'd prefer we either:
1. rely on **kwargs without explicitly specifying parameter names in the function signature or,
2. create a load() function (to go with the save() function that will replace repr()) that loads state data into an already constructed runner object
  - I prefer (ii) since it means save state data can be loaded into an already constructed instance. Though it means you have to construct and then load data, that's no big deal, since you can just start one with autostart=False.
In concert with this, we shouldn't require people to serialize the program or string table in the save file. In many cases, the program will be the same across a large number of save states, and serializing the program will make saved states larger than they need to be.
- However, this means that we'll need some way to verify that the version of the program used to load a saved state matches (or is compatible with) the version of the program used to save a state. We should include some sort of "program version" in the saved state, which we can verify on load. For now, a simple MD5 hash of the program and string table files should suffice, we can build a less brittle program version indicator in the future if necessary.

Overall, I think, the major information that should be serialized in the save state should be:

visits
variables
the line buffer
the option buffer
the VM's data stack
the VM's instruction stack
the current program counter
the previous instruction
whether or not execution has finished
- (this last one may seem counterintuitive, but I'd like the ability to save a finished run and reference it later. in the current time, the only really valuable information after a finished run would be the visits and variables, but I expect to implement tracing of node paths in the future, which might give us another reason to save a completed execution.)

The major information that should not be serialized is probably:

the program file
the string tables
other constructor parameters (enable_tracing, newlines, etc)
command handlers

How do you feel about these suggestions? Do you feel comfortable making these changes?

mapto · 2022-01-24T21:08:41Z

Thanks for the feedback!

At a personal level, my team dropped out from the game jam, yet both our story writer and I are interested to continue to explore yarn. So this means I can continue to work on this PR, but will be able to dedicate less time in February.

Now regarding solutions: Certainly agree that JSON is a better solution and I would readily do it the way you describe. However, I already tried serializing both the information you indicated as should and the one that you indicated as shouldn't and my test with shortcuts continues to fail, so I am sceptical that switching to JSON would take us any forward.

Given that it appears to me that explicit state serialization seems to miss something that remains unidentified, I am wondering if it makes sense to try serializing with (pickle)[https://docs.python.org/3/library/pickle.html] which shouldn't leave any chances.

Anyway, I'll try with JSON and if that same test continues to fail we can discuss that other option.

mapto · 2022-01-24T22:00:35Z

Good news: a quick implementation with JSON passed the previously failing test.

TODO: the hash function for version comparison and wider testing

mapto added 2 commits January 9, 2022 23:03

added optional constructor parameters and repr method

4176c2c

Added test coverage for constructor and repr()

3635f9b

mapto changed the title ~~Runner state setting and getting (addressing issue #1)~~ Runner state setting and getting Jan 10, 2022

Improved repr for command_handlers; improved tests

6232731

mapto added 5 commits January 19, 2022 13:50

files alternative to jsons

63114bb

Merge branch 'main' of github.com:mapto/YarnRunner-Python into main

ad351cb

removed .orig files included by mistake

a5e2c4c

Resolved conflicts with original

c500613

Merge branch 'realplay-open' into main

7296682

Switched to json serialization

66b44d1

Added current_node in export; had forgotten it

b09d63b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runner state setting and getting #2

Runner state setting and getting #2

mapto commented Jan 10, 2022 •

edited

Loading

weaversam8 commented Jan 11, 2022

mapto commented Jan 15, 2022

mapto commented Jan 15, 2022

mapto commented Jan 19, 2022

weaversam8 commented Jan 24, 2022

mapto commented Jan 24, 2022

mapto commented Jan 24, 2022

Runner state setting and getting #2

Are you sure you want to change the base?

Runner state setting and getting #2

Conversation

mapto commented Jan 10, 2022 • edited Loading

weaversam8 commented Jan 11, 2022

mapto commented Jan 15, 2022

mapto commented Jan 15, 2022

mapto commented Jan 19, 2022

weaversam8 commented Jan 24, 2022

mapto commented Jan 24, 2022

mapto commented Jan 24, 2022

mapto commented Jan 10, 2022 •

edited

Loading