Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal for new tests(Moves) #305

Closed
thysultan opened this issue Nov 19, 2017 · 41 comments
Closed

Proposal for new tests(Moves) #305

thysultan opened this issue Nov 19, 2017 · 41 comments

Comments

@thysultan
Copy link
Contributor

thysultan commented Nov 19, 2017

The performance of these tests would give a good idea of the number of DOM operations that different framework reconciliation heuristics can guarantee.

  1. move first row to last.
  2. move last row to first.

The worst case for a single move. Frameworks that execute only one DOM op would perform better than ones that move more than one Node to archive the reconciled state.

  1. remove first row + append new row.
  2. remove last row + prepend new row.

Common in infinite list implementations and charts. Tests whether the framework appends and moves multiple Nodes to archive a reconciled state vs a single remove + append operation to archive the least DOM ops and the least DOM state disruption. Frameworks that do the former would intuitively perform better.

  1. reverse list.

Tests what happens when everything moves, similar to the "replace all rows" test but for movement.

@leeoniya
Copy link
Contributor

leeoniya commented Nov 19, 2017

i don't think 5. reverse list is necessary since it's a nested variation of swap rows (for bi-directional reconcilers.). i would expect any libs that get swap rows right to also do well in reversing the list. but perhaps not.

one thing i do think should be considered is to increase the swap rows distance, eg first/last. as discovered recently, maquette does not reuse nodes while swapping [1] (which is another topic of whether it can be considered keyed). the increased distance should accentuate what each lib actually does during the swap. right now mutating 5 nodes in between the two swapped ones is relatively cheap, which would not be the case if 998 nodes had to mutate during a swap.

@thysultan it's probably worth trying the additional mutations and see if they reveal anything new. it's likely not worthwhile to increase the bench runtime to prove out the tautology of "slow frameworks are slow". if any of the additional tests track an existing metric (eg long swap / reverse), then i dont think adding them is necessary.

[1] #270 (comment)

@thysultan
Copy link
Contributor Author

Yes, i think a larger distance between the swapped rows would also archive a similar effect as the reverse list test as demonstrated in #390.

@leeoniya
Copy link
Contributor

leeoniya commented Nov 20, 2017

@thysultan can you do some testing of 1-4 for a few fast and slow libs to see if anything unexpected shows up?

@thysultan
Copy link
Contributor Author

thysultan commented Nov 20, 2017

This forks "new-tests" branch implements them for a few frameworks(react16, domvm, dio, preact, mithril, vanilla, maquette), https://github.com/thysultan/js-framework-benchmark/tree/new-tests.

@leeoniya
Copy link
Contributor

thanks! i'll give them a run. probably worth updating the swap to 1/998 instead of 0/999, too [1]

[1] #306 (comment)

@thysultan
Copy link
Contributor Author

thysultan commented Nov 20, 2017

Current results for that(swap 1/998) and 1 & 2, Haven't yet added 3 & 4 to webdriver tests. Also seems the vanilla impl for 2 is missing something.

localhost_8080_webdriver-ts-results_table html

@thysultan
Copy link
Contributor Author

Managed to get them all working, the sceenshot is a run of the tests once(count 1).

localhost_8080_webdriver-ts-results_table html 1

@leeoniya
Copy link
Contributor

leeoniya commented Nov 20, 2017

nice. but vanilla swap looks off. it was still the fastest for me when i tested. can you doublecheck the changes to it? there's no way domvm beats it.

any new benches here where vanillajs is not the fastest need to be made faster (as imperative as necessary, same as what vanilla swap does).

additionally, i think that the first/last caveat that i found for swap applies to these benchmarks, too. perhaps use second and second-to-last for them as well.

@thysultan
Copy link
Contributor Author

Like the swap case some frameworks might employ a move last->first micro-optimization that might be worth changing it to second last-> second, same for move first->last.

To see which frameworks employ a more general approach.

but vanilla swap looks off.

Yes vanilla is off, since i wasn't concerned about small fluctuations in this test, it was run separately after having issues with its test run the first time round.

@leeoniya
Copy link
Contributor

k, thanks. i'll work off your branch later tonight and see if i can get more "expected" results.

@thysultan
Copy link
Contributor Author

Updated with more "expected" results webdriver-ts-results/table.html

rawgit com_thysultan_js-framework-benchmark_ca86960fe1ceaba30de6dfeda243802bdc5f1f86_webdriver-ts-results_table html

@leeoniya
Copy link
Contributor

leeoniya commented Nov 20, 2017

vanilla startup takes 1.3? that's basically impossible with these frameworks. i think only bobril manages to do better in startup than vanilla (somehow).

are you just doing a single run of everything? everything is ±0.0

@thysultan
Copy link
Contributor Author

Yes count is set to 1. Anyway, i await your observations for a comparison.

@krausest
Copy link
Owner

Sounds interesting - let's see which operations yield new insights. I'm definitely interesting in adding those.

@leeoniya
Copy link
Contributor

i think #306 is on pretty solid footing. i'll be checking the rest tonight tweaked to avoid first/last and with some more frameworks.

@leeoniya
Copy link
Contributor

i'm going to hold off running more tests until #310 is merged since it'll play a big role in determining which additional metrics "yield new insights".

@thysultan
Copy link
Contributor Author

thysultan commented Nov 21, 2017

Improved the new tests to all use the second and second last instead of first row and last, with the addition of vue, ivi and surplus implementations. These are the results:

localhost_8080_webdriver-ts-results_table html

@leeoniya
Copy link
Contributor

looks like move last row tracks swap. can you see if this holds for svelte, whose only Achilles heel seems to be swap perf?

@thysultan
Copy link
Contributor Author

thysultan commented Nov 21, 2017

Sure, here are the results: webdriver-ts-results/table.html.

rawgit com_thysultan_js-framework-benchmark_new-tests_webdriver-ts-results_table html

Image Edit: forgot vue.

@localvoid
Copy link
Contributor

localvoid commented Nov 22, 2017

@thysultan

Like the swap case some frameworks might employ a move last->first micro-optimization that might be worth changing it to second last-> second, same for move first->last.

Many libraries are detecting common prefix/suffix before moving nodes at the edges(almost all fast vdom libraries), but they will fail when first/last nodes are removed and then nodes are swapped at the edges. But even better is to just move one node to the opposite edge instead of swapping, to demonstrate that linear algos are failing in such cases https://github.com/localvoid/uibench-base/blob/efacae672bbf4133360a928305876ee1de643e64/lib/uibench.ts#L302-L319

In the latest ivi (>=0.9) I've removed this optimization that detects moves on the edges because it can cause unnecessary moves when adjacent node is inserted/removed. Now ivi just detects common prefix/suffix and then uses generic algo that finds minimum number of DOM ops, and guarantees that inserts/removes won't cause any adjacent nodes to move.

@leeoniya
Copy link
Contributor

great, now we're gonna have to suck in all of uibench :D

i suspect list reversal is gonna destroy the majority of the libs that already do poorly on the 1/998 swap, so it's not gonna yield anything new but will skyrocket the bench time. imo it makes sense to add some more tests that can quickly show slowness without incurring major runtime costs really for no reason. for example, the move last row might be fairly uninteresting if it exactly tracks the existing swap timings.

i'll try out some of those "worst case scenarios" on domvm and some others, but list reversal here on 1k (instead of uibench's 50) is gonna be a no-go if many libs burn through my laptop case.

@localvoid
Copy link
Contributor

localvoid commented Nov 22, 2017

@leeoniya

List reversal+remove first/last is unnecessary, it was the easiest way to trigger LIS algo in kivi. Now any move will trigger LIS algo in ivi, I think that predictable behavior is more important than this micro optimization.

Remove first/last + move node in the opposite direction is enough to trigger worst cases in most libraries, and I think that some libraries(virtual-dom, maybe elm) are using lookahead technique and they would require to move two nodes in opposite direction to trigger worst case scenario.

snabbdom special case is also unnecessary, I just thought that it behave slightly differently when I wrote this case, now I think that snabbdom has one of the worst algorithms out there.

@thysultan
Copy link
Contributor Author

Yes, swap row is a combo of the both "move last" and "move first", that said the presence of these are beneficial in helping lib authors/users understand which move direction the framework is performing poorly in.

From the above results it looks like the opposite direction(upward) is what is contributing to the poor "swap row" times for those frameworks – that info should at least be of use to lib authors to know where to improve.

The "remove last and prepend" variants better show what happens when a framework cannot employ the common prefix/suffix/edges optimization.

@localvoid
Copy link
Contributor

localvoid commented Nov 22, 2017

Yes, swap row is a combo of the both "move last" and "move first"

No, many linear algorithms will detect that first and last node in the wrong positions and everyone else at the correct positions. When moving just one node, all nodes except one will be in the wrong positions (depends on the direction of the linear algo).

@thysultan
Copy link
Contributor Author

thysultan commented Nov 22, 2017

many linear algorithms will detect that first and last node in the wrong positions and everyone else at the correct positions

Yes, i expected that to be the case when both edge optimizations are in play. Though it seems like all the frameworks that do perform poorly on "swap rows" within the above sample also perform poorly in a single move in the opposite direction.

I also moved away from the common edges optimization in DIO for the exact reasons mentioned before – A, B -> C, A.

@leeoniya
Copy link
Contributor

i implemented a couple of the worst case scenarios for ivi, vanilla, domvm.

kiviWorstCase (trim first & last, then reverse list):

  • vanilla: 28.5ms (using a insertBefore/childNodes/firstChild loop, so might not be optimal)
  • ivi: 52.7ms
  • domvm: 33.6ms

snabbdomWorstCase (move first and second-to-last to end):

  • vanilla: 0.6ms
  • ivi: 25.5ms
  • domvm: 65.3ms

so basically, there isn't a deep enough red in this benchmark to properly represent even the fastest reconcile algo for snabbdomWorstCase without moving to some logarithmic color scale. i'll post more thorough results later.

@localvoid
Copy link
Contributor

localvoid commented Nov 22, 2017

Yes, i expected that to be the case when both edge optimizations are in play.

Even without edge optimizations, some linear algos should be able to easily handle swap use cases. All nodes between swapped nodes will have exactly the same position that were in the previous list, so it won't touch them, but if you move one node, all nodes will be shifted by one. And depending on the order this algo traverses list, worst case can be either move to the end of the list or vice versa.

@thysultan
Copy link
Contributor Author

You're right, "move" vs "swap" can present different results depending on the algorithm used by the reconciler.

@thysultan
Copy link
Contributor Author

@leeoniya vanilla 0.6ms doesn't look right, i imagine it should be closer to "move (first/last) row". Do you have a link to the fork with these implementation?

@leeoniya
Copy link
Contributor

leeoniya commented Nov 22, 2017

wip patch is attached.

i should have clarified. this was scripting time only. the full metrics for snabbdomWorstCase were:

vanilla:
0.5   scripting
49.5  rendering
8.3   painting
13.8  other

ivi:
14.1  scripting
59.7  rendering
8.5   painting
7.5   other

domvm:
49.5  scripting
110.8 rendering
3.8   painting
2.5   other

the main point i'm trying to make, i guess is #316. there can never be vanilla-levels of scripting code in actual data-driven apps, so the baseline of "0 scripting" is unrealistic, though not to the degree that i indicated, sry.

if you just took the raw numbers, it would seem that ivi is 41% slower than vanilla: (14.1 + 59.7 + 8.5) / (0.5 + 49.5 + 8.3). But that's not true because no actual apps can do vanilla's 0.5 scripting time. The baseline should be the fastest data-driven reconciler (which may in fact be ivi or petit)

0001-wip.zip

@eavichay
Copy link
Contributor

eavichay commented Nov 23, 2017

Can this benchmark have more focus on real world use cases? I wonder what is the use-case for replacing 1/999. I can see use case for alphabetical sort though (which should be similar to replace all rows in non-keyed, but perhaps we can find surprises in keyed implementations). I can see use case for replacing two adjacent rows. Can't see what's the use of swapping 1/999 as opposed to 4/9.

@thysultan
Copy link
Contributor Author

A list of data could be sorted such that it is a single swap operation away from being sorted alphabetically, or generally any X operation – for example within a list that could be randomly ordered via user input of other variables there is an near-infinite number of order-states it could be in before a sort operation(i.e alphabetically) is applied to it. So switching the second and second-last falls within a real-world case.

That said the operations listed are an attempt to be a litmus test to give authors an idea of what the libs reconcile heuristics can guarantee and as such what areas it can improve. The underlining tone of this is that moving more DOM nodes than necessary affects performance but more so also affects DOM state.

@eavichay
Copy link
Contributor

eavichay commented Nov 23, 2017

@thysultan Sorting alphabetically is a "real-world" use-case. It can be converted to random-shuffle the list (but keep the objects intact). This should make trouble for non-keyed but probably should be handled better by keyed implementations. In order to keep the tests fair - the shuffle algorithm should be implemented in the store object, and it makes sure ALL elements are swapped. I don't mind even if the swapping would be hard-coded to keep it fair.

@thysultan
Copy link
Contributor Author

@eavichay Sorting alphabetically looks like it would put the vanillajs implementation closer to other libs(if it's a random-shuffle), but as an author i would loose the ability to pin-point/reproduce at what exact operation/pattern my lib is performing bad in.

@eavichay
Copy link
Contributor

@thysultan I'm not sure what you're getting at. I am suggesting a test for "swap all rows - keep data". Perhaps an Array.reverse may do the trick, but I think it would be too easy to "cheat". Shuffling, then, would make sure the test is fair.

@thysultan
Copy link
Contributor Author

@eavichay My concern is that if you can't reproduce a shuffle, there would be a lose of useful data on what steps lead to a particular result, data that a lib author could use to improve.

@eavichay
Copy link
Contributor

@thysultan You can reproduce a shuffle (either using a constant seed which may cause too much javascript code) or simply writing a script that shuffles 1K items, produce the results and I could implement it hardcoded.
Another option is to reverse items in the list (in partitions of n-size)

@thysultan
Copy link
Contributor Author

@eavichay Can you try this for a few frameworks and see what data picture is painted from that?

My only suggestion is to probably only ever move < 50% of the nodes in order to get a picture which libs preserve DOM state of the nodes that don't move, < 50% because the problem with "move everything" is that the concept of least number of DOM operations to get to reconciled state doesn't exist when everything has moved which conceptually would for the most part be the same for each framework.

@leeoniya
Copy link
Contributor

after doing a healthy amount of uibench testing, i think @localvoid's "worst case" scenarios [1] are a good next step.

unlike uibench, this benchmark is BYO store implementation as well as needing a UI button for each test case. these requirements make updating all implementations a very substantial task.

if it was possible to avoid having to add a UI button & event binding for each additional metric, the task would still be tedious, but digestible. one way to do this would be to expose functions globally from each impl and have the bench runner invoke them as eg window.snabbdomWorstCase(). it's probably not sustainable to continue growing the UI for every possible store mutation added to this benchmark.

@krausest thoughts?

[1] https://github.com/localvoid/uibench-base/blob/efacae672bbf4133360a928305876ee1de643e64/lib/uibench.ts#L302-L319

@Havunen
Copy link
Contributor

Havunen commented Jul 25, 2019

It would be nice to have @localvoid 's uibench "moves" added here into this benchmark.

@krausest
Copy link
Owner

I'm closing older issues that won't get fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants