Web API for base robot outputs 800MB per minute #2106

Open
xsebek opened this issue Aug 10, 2024 · 6 comments
Labels
Bug The observed behaviour is incorrect or unexpected. T-Debugging Involves the debugger + related tools T-Web Involves the web interface - generally communicating with Swarm via ports. Z-Performance This issue concerns memory or time efficiency of the Swarm game.

Comments

@xsebek
Member

xsebek commented Aug 10, 2024

Describe the bug

Trying to get the base log after running "example/list.sw" does not terminate, at least not in any reasonable time.

That's because the output is flooded with an infinite (or exponentially large) stream of JSON syntax:

[2902,2904],"_sTerm":{"contents":1,"tag":"TInt"},"_sType":[]}],"tag":"SApp"},"_sType":[]},{"_sComments":{"_afterComments":[],"_beforeComments":[]},"_sLoc":[2906,2909],"_sTerm":{"contents":"acc","tag":"
TVar"},"_sType":[]}],"tag":"SApp"},"_sType":[]}],"tag":"SApp"},"_sType":[]},{"_sComments":{"_afterComments":[],"_beforeComments":[]},"_sLoc":[2911,2918],"_sTerm":{"contents":[{"_sComments":{"_afterComm
ents":[],"_beforeComments":[]},"_sLoc":[2912,2914],"_sTerm":{"contents":[{"_sComments":{"_afterComments":[],"_beforeComments":[]},"_sLoc":"NoLoc","_sTerm":{"contents":"Div","tag":"TConst"},"_sType":[]}
,{"_sComments":{"_afterComments":[],"_beforeComments":[]},"_sLoc":[2912,2914],"_sTerm":{"contents":"i","tag":"TVar"},"_sType":[]}],"tag":"SApp"},"_sType":[]},{"_sComments":{"_afterComments":[],"_before

To Reproduce

Run swarm:

swarm --scenario creative --run "example/list.sw"

Get the output using cURL; I suggest capping the request at 60 seconds and piping it to wc.
Optionally, observe the process memory usage with ps, top, or an alternative like btop.

ps -o uid,pid,ppid,ni,vsz,rss,stat,tty,time,command -p $(ps -o pid,comm | grep swarm | cut -f1 -d' ')
#  UID   PID  PPID NI      VSZ    RSS STAT TTY           TIME COMMAND
#  501 85641 44657  0 679915760 140512 S+   ttys003    0:02.74 swarm --scenario creative --run example/list.sw

curl --max-time 60 -s localhost:5357/robot/0 | wc
#       0    5760 874380819

ps -o uid,pid,ppid,ni,vsz,rss,stat,tty,time,command -p $(ps -o pid,comm | grep swarm | cut -f1 -d' ')
#  UID   PID  PPID NI      VSZ    RSS STAT TTY           TIME COMMAND
#  501 85641 44657  0 679923952 12241648 S+   ttys003    1:02.89 swarm --scenario creative --run example/list.sw

Converting to gigabytes:

  • Swarm was using 0.140 GB after loading the file
  • In one minute, robot/0 outputted 0.874 GB of JSON
  • Swarm's real-memory usage rose to roughly 12 GB, and only dropped back to about 10 GB after some time
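
As a rough cross-check of the figures above, the following sketch derives two unit-independent numbers from the ps and wc output: the average output rate and the growth factor of the resident set size. It assumes the transfer ran for the full 60 seconds allowed by --max-time; the variable names are illustrative only.

```python
# Values copied from the ps and wc output above.
rss_before = 140512        # RSS column before the request
rss_after = 12241648       # RSS column after the request
bytes_streamed = 874380819 # character count reported by wc
seconds = 60               # assumption: curl ran until --max-time expired

# Average output rate in megabytes (10^6 bytes) per second.
throughput_mb_per_s = bytes_streamed / seconds / 1e6

# Ratio of the two RSS readings; independent of the unit ps uses.
rss_growth_factor = rss_after / rss_before

print(f"{throughput_mb_per_s:.1f} MB/s, RSS grew {rss_growth_factor:.0f}x")
```

So the server was producing on the order of 15 MB of JSON per second while its resident memory grew by nearly two orders of magnitude.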

Expected behavior

I wanted to get the base log, i.e. this command should work:

curl -s localhost:5357/robot/0 | yq -P '.log'

Unfortunately, gigabytes of JSON-encoded Swarm syntax come before the log.

Ideally, the ToJSON output for syntax should be usable as is. Failing that, I could use a workaround such as a dedicated robot/0/log endpoint that returns the log directly, or, if the log were placed before the syntax, I could extract it and then stop cURL.

Screenshots

Screenshot 2024-08-10 at 12 14 12 PM

Additional context

Given that the list example was only parsed and the functions were not run, this issue likely affects other solutions as well.

The other example files did not cause this magnitude of syntax output, but it's possible some other solutions would.

@xsebek xsebek added Bug The observed behaviour is incorrect or unexpected. Z-Performance This issue concerns memory or time efficiency of the Swarm game. T-Web Involves the web interface - generally communicating with Swarm via ports. T-Debugging Involves the debugger + related tools labels Aug 10, 2024
@xsebek
Member Author

xsebek commented Aug 10, 2024

The same thing happens with the sliding puzzle solution:

swarm --scenario data/scenarios/Challenges/Sliding\ Puzzles/3x3.yaml --autoplay

@xsebek
Member Author

xsebek commented Aug 10, 2024

For comparison, here is the number of lines in each example:

for f in example/*; do wc -l $f; done | sort -n
       5 example/omega.sw
       9 example/maybe.sw
      12 example/wander.sw
      13 example/dfs.sw
      13 example/fact.sw
      20 example/multi-key-handler.sw
      24 example/cat.sw
      26 example/pilotmode.sw
      66 example/BFS-clear.sw
     107 example/rectypes.sw
     305 example/list.sw

And here is the length of the output JSON, measured in lines of the indented, human-readable JSON:

curl -s localhost:5357/robot/0 | jq '.program' | wc -l
    346 swarm --scenario blank --run example/omega.sw
     12 swarm --scenario blank --run example/maybe.sw
    146 swarm --scenario blank --run example/wander.sw
   1699 swarm --scenario blank --run example/dfs.sw
   1223 swarm --scenario blank --run example/fact.sw
   4334 swarm --scenario blank --run example/multi-key-handler.sw
   5838 swarm --scenario blank --run example/cat.sw
     12 swarm --scenario blank --run example/pilotmode.sw
 125887 swarm --scenario blank --run example/BFS-clear.sw
1062053 swarm --scenario blank --run example/rectypes.sw
    ??? swarm --scenario blank --run example/list.sw
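
To make the comparison easier to read, here is a small sketch that divides the JSON line count by the source line count for each example. The numbers are copied from the two listings above; list.sw is left out because its output never finished. The dictionary and variable names are made up for this illustration.

```python
# (source lines, JSON lines) per example, copied from the listings above.
measurements = {
    "omega.sw": (5, 346),
    "maybe.sw": (9, 12),
    "wander.sw": (12, 146),
    "dfs.sw": (13, 1699),
    "fact.sw": (13, 1223),
    "multi-key-handler.sw": (20, 4334),
    "cat.sw": (24, 5838),
    "pilotmode.sw": (26, 12),
    "BFS-clear.sw": (66, 125887),
    "rectypes.sw": (107, 1062053),
}

# JSON lines produced per line of source.
blowup = {
    name: json_lines / src_lines
    for name, (src_lines, json_lines) in measurements.items()
}

# Print from the smallest to the largest blow-up factor.
for name, ratio in sorted(blowup.items(), key=lambda item: item[1]):
    print(f"{name:22} {ratio:10.1f}")

worst = max(blowup, key=blowup.get)
```

The ratio is far from constant: it climbs from about 1 for maybe.sw to nearly 10,000 for rectypes.sw, so the output grows much faster than linearly with the size of the source.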

@xsebek
Member Author

xsebek commented Aug 10, 2024

Maybe we can analyze rectypes.sw to see what is going on:

swarm --scenario blank --run example/rectypes.sw
curl -s localhost:5357/robot/0 | yq -P '.program' | sed 's/^ *//;s/ *$//' | sort | uniq -c | sort -n

These 3000 repetitions of source positions stand out to me:

3026 - 571
3026 - 575
3026 - 582

Looking in the file, character positions 571, 575 and 582 correspond to:

def cons : a -> List a -> List a = \x. \l. inr (x, l) end
                                           ^   ^    ^

So cons gets repeated a lot. @byorgey do you have any idea how that would happen?
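
For anyone who prefers to do the frequency count outside the shell, the sed | sort | uniq -c | sort -n pipeline above can be approximated in Python as follows. This is only a sketch: the function name is made up, and the sample text is invented to show the shape of the result, not real Swarm output.

```python
from collections import Counter

def count_repeated_lines(text):
    """Strip surrounding spaces from each line and count occurrences, rarest first."""
    counts = Counter(line.strip(" ") for line in text.splitlines())
    return sorted(counts.items(), key=lambda item: item[1])

# Invented sample standing in for the YAML produced by yq.
sample = "  - 571\n- 575\n    - 571\n- 582\n  - 571\n"

for line, n in count_repeated_lines(sample):
    print(n, line)
```

The most frequent lines end up at the bottom, just as with sort -n.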

@byorgey
Member

byorgey commented Aug 10, 2024

Yes, I'm pretty sure I know exactly why this is happening; see #1907 (comment). I don't think there's anything about cons in particular; you just happened to notice that one.

Many continuation stack frames in a robot's CESK machine contain an Env, and they have a lot of shared entries (e.g. in a context with 20 definitions, if you process one more def you now have the same 20 definitions still in scope plus one more). This is not usually a problem in memory, because the shared entries are literally shared: all the Env values just contain pointers. But serializing loses all the sharing.
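
The effect described above can be reproduced with a toy model. The sketch below is not Swarm's Env or CESK machine; it assumes a simplified environment that is a linked chain of definitions, with one stack frame per definition holding a reference to the chain so far. In memory there are only 20 environment nodes, but a naive JSON dump writes out the whole chain for every frame.

```python
import json

def build_frames(n_defs):
    """Build n_defs frames whose environments share structure by reference."""
    env = None
    frames = []
    for i in range(n_defs):
        # Each new environment points at the previous one instead of copying it.
        env = {"name": f"def{i}", "parent": env}
        frames.append({"frame": i, "env": env})
    return frames

frames = build_frames(20)

# In memory: one environment node per definition.
distinct_envs = len({id(f["env"]) for f in frames})

# Serialized: each frame expands its full chain, so def0 is written once per frame.
serialized = json.dumps(frames)
copies_of_def0 = serialized.count('"def0"')
total_env_entries = serialized.count('"name"')

print(distinct_envs, copies_of_def0, total_env_entries)  # prints: 20 20 210
```

Even this simple chain turns 20 shared nodes into 210 serialized entries (1 + 2 + ... + 20). Structures in which environments are nested inside other values can presumably grow much faster than that.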

To solve this we will have to either (1) recover the sharing when serializing somehow, or (2) store things in the first place in a way that makes the sharing more explicit.
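
One possible shape for option (1), sketched on a toy model rather than on Swarm's real data types: give every distinct environment node an index in a table, and let frames and parent links refer to nodes by index. All names here (build_frames, serialize_shared, restore) are hypothetical. Note that Python exposes object identity through id(), which pure Haskell code does not, so this illustrates only the output format, not how Swarm would detect the sharing.

```python
import json

def build_frames(n_defs):
    """Toy model: one frame per definition, environments shared by reference."""
    env = None
    frames = []
    for i in range(n_defs):
        env = {"name": f"def{i}", "parent": env}
        frames.append({"frame": i, "env": env})
    return frames

def serialize_shared(frames):
    """Write each distinct environment node once and refer to it by table index."""
    index_of = {}  # id(node) -> position in table
    table = []     # serialized nodes, parents always before children

    def intern(env):
        if env is None:
            return None
        if id(env) not in index_of:
            parent_ref = intern(env["parent"])  # the parent is stored first
            index_of[id(env)] = len(table)
            table.append({"name": env["name"], "parent": parent_ref})
        return index_of[id(env)]

    refs = [{"frame": f["frame"], "env": intern(f["env"])} for f in frames]
    return {"envs": table, "frames": refs}

def restore(doc):
    """Rebuild the frames, re-creating the sharing from the index references."""
    envs = []
    for entry in doc["envs"]:
        parent = None if entry["parent"] is None else envs[entry["parent"]]
        envs.append({"name": entry["name"], "parent": parent})
    return [{"frame": f["frame"], "env": envs[f["env"]]} for f in doc["frames"]]

frames = build_frames(20)
doc = serialize_shared(frames)
naive_size = len(json.dumps(frames))
shared_size = len(json.dumps(doc))
print(len(doc["envs"]), naive_size, shared_size)
```

The table holds one entry per node, so the output grows linearly with the number of definitions, and restore gives back frames that share their environments again. The cost is a lookup per node during serialization, which is where the efficiency concern would show up in a real implementation.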

@xsebek
Member Author

xsebek commented Aug 10, 2024

@byorgey in the linked issue, I meanwhile arrived at idea (1). 😄

I would very much like (2) but it sounds like a big rewrite. Though it might be necessary if we want to serialize/deserialize robots and (1) turns out to not work. 🤔

@byorgey
Member

byorgey commented Aug 10, 2024

I mean, this is also going to be a critical component of #50, so (2) could be worth it. I think (1) will work; I am just worried about making it efficient.
