Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

central data store #115

Closed
oliver-sanders opened this issue Jul 10, 2019 · 16 comments · Fixed by #123
Closed

central data store #115

oliver-sanders opened this issue Jul 10, 2019 · 16 comments · Fixed by #123
Assignees
Milestone

Comments

@oliver-sanders
Copy link
Member

We will be providing multiple "views" on the same workflow[s] (tree, graph, dot, ...) and permitting users to open multiple views simultaneously.

It is quite likely that users might even want duplicate views open, for example I think this will be a common use case:

  • two tree views
    • one showing everything with families collapsed
    • one filtering out only failed tasks with families expanded

To deal with this elegantly whilst minimising server load and browser memory usage it makes sense for the client to maintain one data model which drives the various views (i.e. a standard MTV data model).

This would mean:

  • One model
  • Each view registers subscriptions with that model
  • Subscription queries are "flat" (tasks, jobs, families are kept separate rather than nested)
  • When subscriptions are added / removed all active subscriptions are merged
  • The client subscribes to this merged query
  • An interface for performing one-off queries for any data we don't need to keep up to date

Merging queries is fairly straight forward (especially if the queries are straight forward), see this demo:

https://github.com/oliver-sanders/cylc-ui-graphql-proxy

Implications:

  • Consistency between the data model used by the different views
    • Views must work with the flat data model rather than requesting data in the shape they want
  • Central management of all subscriptions by the client
    • Giving us the potential to do clever things like suspending subscriptions for hidden views (e.g. the gantt view in this sketch)
  • More filtering would be done client-side, less server-side (is this good or bad?)

Some pros some cons.

@oliver-sanders
Copy link
Member Author

@kinow pinging you, I expect it would probably be fairly easy to get the tree view hooked into flat data, what do you think about this approach?

@oliver-sanders oliver-sanders added the question Flag this as a question for the next Cylc project meeting. label Jul 10, 2019
@dwsutherland
Copy link
Member

@oliver-sanders - You could just have a flat list of nodes, and any structured data be just IDs (graph or tree) ... But, yes if the structured view can just be driven off the flat data (containing the relationships) then no problem on the GraphQL front (the request can be multiple queries/mutations, so I would say the subscription also)..

More filtering would be done client-side, less server-side (is this good or bad?)

Main negative would be the amount of data sent over the internet, probably why it's a main feature I suppose... However, you could take a hybrid approach using server-side filters for things that should effect both views?

@oliver-sanders
Copy link
Member Author

However, you could take a hybrid approach using server-side filters for things that should effect both views?

What I was thinking. Might involve being a little more clever at the data store but do-able.

@kinow
Copy link
Member

kinow commented Jul 11, 2019

@oliver-sanders

It is quite likely that users might even want duplicate views open, for example I think this will be a common use case:

I saw your comment in the #116 , on the gscan. It says that the collapsed/expanded in the gscan must match the tree view... but in this case, with multiple tree views for the same suite, I think this wouldn't work?

I kind of prefer to have each component with a separate selection as it is simpler to implement for now, and could be changed later once everything is put in place and is working OK. WDYT?

@kinow
Copy link
Member

kinow commented Jul 11, 2019

@kinow pinging you, I expect it would probably be fairly easy to get the tree view hooked into flat data, what do you think about this approach?

+1 from me, you convinced me in the meetup some weeks ago :-) plus, we just need to push this to a Vuex store, and then the data can be shared by multiple components with no issues.

I guess most of the work will be updating everything to use the flat data, and also filtering. But that shouldn't be a problem.

@kinow
Copy link
Member

kinow commented Jul 11, 2019

Oh, and the current suite tree view is already global. We could open multiple views (with some work on the component, but not worth as it needs updating for flat data, new layout, etc) -

So I think we just need a new value there, like... workflows? Then a Websocket connection would be established with the backend, sending GraphQL, processing responses and updating the workflows object.

Vue.js components would be plugged into this single data store/source, and react upon updates.

@kinow
Copy link
Member

kinow commented Jul 11, 2019

Merging queries is fairly straight forward (especially if the queries are straight forward), see this demo:

https://github.com/oliver-sanders/cylc-ui-graphql-proxy

Thanks a ton for the code example. I think I now understand where we would use it, and started reading about batching and merging.

Have you looked into batching too? Is it a possible alternative for merging the queries? I remember you mentioned something about Apollo having a feature that could be missing in Python Graphene... was it batching?

@matthewrmshin
Copy link
Contributor

In batching, do you get multiple sets of results back?

@sadielbartholomew
Copy link
Contributor

Have you looked into batching too? Is it a possible alternative for merging the queries? I remember you mentioned something about Apollo having a feature that could be missing in Python Graphene... was it batching?

For batching, there is a tried-&-tested package called DataLoader which I think could be very useful for us. I mentioned it in the Exeter June/July discussions, but the conversation direction went elsewhere. There's a good outline with a real example here.

Could we use this?

@kinow
Copy link
Member

kinow commented Jul 11, 2019

AFAICT, in the client side, apollo will group queries submitted within a certain timeout in a single request.

Then on the server side we can automatically execute multiple queries and return the result (multiple queries for 2 tree views open, for example).

Or we can also merge the queries at this stage, on the server.

However, the article I linked above appeared to suggest this wasn't really always the best approach.

This other article has a similar advice.

Although query merging is very effective, it has a few downsides. Primarily, if you’re trying to debug stuff from the server’s point of view, you’ll see queries that look very different from what you initially fired on the client, which could make it harder to debug your queries. One potential solution is described by Lee Byron, one of the creators of GraphQL, in his talk about new GraphQL features: batching at the network transport level.

Though, after reading these posts, I think the query merging approach might work initially, and if we have any problems we can try a different approach?

@kinow
Copy link
Member

kinow commented Jul 11, 2019

I think I remember you mentioning it Sadie, and it is mentioned in some of these articles.

Would be nice to understand pros and cons of merging and of batching. But now I am wondering whether we should do that now, or go with the query proxy suggested by Oliver and compare with the other options

@sadielbartholomew
Copy link
Contributor

Would be nice to understand pros and cons of merging and of batching. But now I am wondering whether we should do that now, or go with the query proxy suggested by Oliver and compare with the other options

Sure. I just wanted to raise it as something to consider, really, since I am not sure how it could tie in with the current plans for the data provision, as it seems to me our thoughts & plans for this have been changing quite quickly of late, with lots of different opinions, etc. Also, it all involves some technical concepts I have only recently understood the basics of. My aim was really to inform you & others with the greatest knowledge & experience on this about it in case you weren't aware!

@oliver-sanders
Copy link
Member Author

batching

My understanding of GraphQL batching is that it simply combines multiple queries (or mutations) into one REST request. As we will be using websockets I don't think we have anything to gain from this.

In batching, do you get multiple sets of results back?

I think so, does anyone know otherwise?

@oliver-sanders
Copy link
Member Author

merging

FYI there is a lot of GraphQL merging stuff out there but all I could find was for schema rather than merging queries. This is so you can have different services prodiving different data but still write a universal query with some client-side magic splitting it up and sending the bits where they need to go.

@oliver-sanders
Copy link
Member Author

but in this case, with multiple tree views for the same suite, I think this wouldn't work?

Not quite sure what you mean here.

I was talking about the way rows get expanded / collapsed we have (at least) three views which use nested trees:

  • tree view
  • dot view
  • gscan

We should share css/js between these for consistency.

@oliver-sanders oliver-sanders self-assigned this Jul 11, 2019
@dwsutherland
Copy link
Member

I need to spend some time reading up on Batching and Data Loader... But I understand the latter eliminates/solves the N+1 problem of nested queries (among other things?) by collating all the resolver calls beforehand or something...

@oliver-sanders oliver-sanders mentioned this issue Jul 12, 2019
3 tasks
@oliver-sanders oliver-sanders added enhancement and removed question Flag this as a question for the next Cylc project meeting. labels Jul 16, 2019
@kinow kinow added this to the 0.1 milestone Sep 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants