Proposal for new tests(Moves) #305

thysultan · 2017-11-19T11:39:34Z

The performance of these tests would give a good idea of the number of DOM operations that different framework reconciliation heuristics can guarantee.

move first row to last.
move last row to first.

The worst case for a single move. Frameworks that execute only one DOM op would perform better than ones that move more than one Node to archive the reconciled state.

remove first row + append new row.
remove last row + prepend new row.

Common in infinite list implementations and charts. Tests whether the framework appends and moves multiple Nodes to archive a reconciled state vs a single remove + append operation to archive the least DOM ops and the least DOM state disruption. Frameworks that do the former would intuitively perform better.

reverse list.

Tests what happens when everything moves, similar to the "replace all rows" test but for movement.

leeoniya · 2017-11-19T18:36:42Z

i don't think 5. reverse list is necessary since it's a nested variation of swap rows (for bi-directional reconcilers.). i would expect any libs that get swap rows right to also do well in reversing the list. but perhaps not.

one thing i do think should be considered is to increase the swap rows distance, eg first/last. as discovered recently, maquette does not reuse nodes while swapping [1] (which is another topic of whether it can be considered keyed). the increased distance should accentuate what each lib actually does during the swap. right now mutating 5 nodes in between the two swapped ones is relatively cheap, which would not be the case if 998 nodes had to mutate during a swap.

@thysultan it's probably worth trying the additional mutations and see if they reveal anything new. it's likely not worthwhile to increase the bench runtime to prove out the tautology of "slow frameworks are slow". if any of the additional tests track an existing metric (eg long swap / reverse), then i dont think adding them is necessary.

[1] #270 (comment)

thysultan · 2017-11-20T04:03:29Z

Yes, i think a larger distance between the swapped rows would also archive a similar effect as the reverse list test as demonstrated in #390.

leeoniya · 2017-11-20T04:32:47Z

@thysultan can you do some testing of 1-4 for a few fast and slow libs to see if anything unexpected shows up?

thysultan · 2017-11-20T09:02:27Z

This forks "new-tests" branch implements them for a few frameworks(react16, domvm, dio, preact, mithril, vanilla, maquette), https://github.com/thysultan/js-framework-benchmark/tree/new-tests.

leeoniya · 2017-11-20T17:22:37Z

thanks! i'll give them a run. probably worth updating the swap to 1/998 instead of 0/999, too [1]

[1] #306 (comment)

thysultan · 2017-11-20T17:31:10Z

Current results for that(swap 1/998) and 1 & 2, Haven't yet added 3 & 4 to webdriver tests. Also seems the vanilla impl for 2 is missing something.

thysultan · 2017-11-20T18:06:27Z

Managed to get them all working, the sceenshot is a run of the tests once(count 1).

leeoniya · 2017-11-20T18:13:07Z

nice. but vanilla swap looks off. it was still the fastest for me when i tested. can you doublecheck the changes to it? there's no way domvm beats it.

any new benches here where vanillajs is not the fastest need to be made faster (as imperative as necessary, same as what vanilla swap does).

additionally, i think that the first/last caveat that i found for swap applies to these benchmarks, too. perhaps use second and second-to-last for them as well.

thysultan · 2017-11-20T18:18:25Z

Like the swap case some frameworks might employ a move last->first micro-optimization that might be worth changing it to second last-> second, same for move first->last.

To see which frameworks employ a more general approach.

but vanilla swap looks off.

Yes vanilla is off, since i wasn't concerned about small fluctuations in this test, it was run separately after having issues with its test run the first time round.

leeoniya · 2017-11-20T18:24:00Z

k, thanks. i'll work off your branch later tonight and see if i can get more "expected" results.

thysultan · 2017-11-20T18:50:10Z

Updated with more "expected" results webdriver-ts-results/table.html

leeoniya · 2017-11-20T18:54:59Z

vanilla startup takes 1.3? that's basically impossible with these frameworks. i think only bobril manages to do better in startup than vanilla (somehow).

are you just doing a single run of everything? everything is ±0.0

thysultan · 2017-11-20T19:00:38Z

Yes count is set to 1. Anyway, i await your observations for a comparison.

krausest · 2017-11-20T20:49:58Z

Sounds interesting - let's see which operations yield new insights. I'm definitely interesting in adding those.

leeoniya · 2017-11-20T21:00:44Z

i think #306 is on pretty solid footing. i'll be checking the rest tonight tweaked to avoid first/last and with some more frameworks.

leeoniya · 2017-11-21T03:54:24Z

i'm going to hold off running more tests until #310 is merged since it'll play a big role in determining which additional metrics "yield new insights".

thysultan · 2017-11-21T10:18:04Z

Improved the new tests to all use the second and second last instead of first row and last, with the addition of vue, ivi and surplus implementations. These are the results:

leeoniya · 2017-11-21T15:01:32Z

looks like move last row tracks swap. can you see if this holds for svelte, whose only Achilles heel seems to be swap perf?

thysultan · 2017-11-21T15:28:23Z

Sure, here are the results: webdriver-ts-results/table.html.

Image Edit: forgot vue.

localvoid · 2017-11-22T00:49:44Z

@thysultan

Like the swap case some frameworks might employ a move last->first micro-optimization that might be worth changing it to second last-> second, same for move first->last.

Many libraries are detecting common prefix/suffix before moving nodes at the edges(almost all fast vdom libraries), but they will fail when first/last nodes are removed and then nodes are swapped at the edges. But even better is to just move one node to the opposite edge instead of swapping, to demonstrate that linear algos are failing in such cases https://github.com/localvoid/uibench-base/blob/efacae672bbf4133360a928305876ee1de643e64/lib/uibench.ts#L302-L319

In the latest ivi (>=0.9) I've removed this optimization that detects moves on the edges because it can cause unnecessary moves when adjacent node is inserted/removed. Now ivi just detects common prefix/suffix and then uses generic algo that finds minimum number of DOM ops, and guarantees that inserts/removes won't cause any adjacent nodes to move.

leeoniya · 2017-11-22T03:23:44Z

great, now we're gonna have to suck in all of uibench :D

i suspect list reversal is gonna destroy the majority of the libs that already do poorly on the 1/998 swap, so it's not gonna yield anything new but will skyrocket the bench time. imo it makes sense to add some more tests that can quickly show slowness without incurring major runtime costs really for no reason. for example, the move last row might be fairly uninteresting if it exactly tracks the existing swap timings.

i'll try out some of those "worst case scenarios" on domvm and some others, but list reversal here on 1k (instead of uibench's 50) is gonna be a no-go if many libs burn through my laptop case.

localvoid · 2017-11-22T05:13:55Z

@leeoniya

List reversal+remove first/last is unnecessary, it was the easiest way to trigger LIS algo in kivi. Now any move will trigger LIS algo in ivi, I think that predictable behavior is more important than this micro optimization.

Remove first/last + move node in the opposite direction is enough to trigger worst cases in most libraries, and I think that some libraries(virtual-dom, maybe elm) are using lookahead technique and they would require to move two nodes in opposite direction to trigger worst case scenario.

snabbdom special case is also unnecessary, I just thought that it behave slightly differently when I wrote this case, now I think that snabbdom has one of the worst algorithms out there.

thysultan · 2017-11-22T05:16:52Z

Yes, swap row is a combo of the both "move last" and "move first", that said the presence of these are beneficial in helping lib authors/users understand which move direction the framework is performing poorly in.

From the above results it looks like the opposite direction(upward) is what is contributing to the poor "swap row" times for those frameworks – that info should at least be of use to lib authors to know where to improve.

The "remove last and prepend" variants better show what happens when a framework cannot employ the common prefix/suffix/edges optimization.

localvoid · 2017-11-22T05:31:06Z

Yes, swap row is a combo of the both "move last" and "move first"

No, many linear algorithms will detect that first and last node in the wrong positions and everyone else at the correct positions. When moving just one node, all nodes except one will be in the wrong positions (depends on the direction of the linear algo).

thysultan · 2017-11-22T05:57:25Z

many linear algorithms will detect that first and last node in the wrong positions and everyone else at the correct positions

Yes, i expected that to be the case when both edge optimizations are in play. Though it seems like all the frameworks that do perform poorly on "swap rows" within the above sample also perform poorly in a single move in the opposite direction.

I also moved away from the common edges optimization in DIO for the exact reasons mentioned before – A, B -> C, A.

leeoniya · 2017-11-22T06:40:34Z

i implemented a couple of the worst case scenarios for ivi, vanilla, domvm.

kiviWorstCase (trim first & last, then reverse list):

vanilla: 28.5ms (using a insertBefore/childNodes/firstChild loop, so might not be optimal)
ivi: 52.7ms
domvm: 33.6ms

snabbdomWorstCase (move first and second-to-last to end):

vanilla: 0.6ms
ivi: 25.5ms
domvm: 65.3ms

so basically, there isn't a deep enough red in this benchmark to properly represent even the fastest reconcile algo for snabbdomWorstCase without moving to some logarithmic color scale. i'll post more thorough results later.

localvoid · 2017-11-22T11:19:02Z

Yes, i expected that to be the case when both edge optimizations are in play.

Even without edge optimizations, some linear algos should be able to easily handle swap use cases. All nodes between swapped nodes will have exactly the same position that were in the previous list, so it won't touch them, but if you move one node, all nodes will be shifted by one. And depending on the order this algo traverses list, worst case can be either move to the end of the list or vice versa.

thysultan · 2017-11-22T13:52:20Z

You're right, "move" vs "swap" can present different results depending on the algorithm used by the reconciler.

thysultan · 2017-11-22T18:02:28Z

@leeoniya vanilla 0.6ms doesn't look right, i imagine it should be closer to "move (first/last) row". Do you have a link to the fork with these implementation?

leeoniya · 2017-11-22T20:22:35Z

wip patch is attached.

i should have clarified. this was scripting time only. the full metrics for snabbdomWorstCase were:

vanilla:
0.5   scripting
49.5  rendering
8.3   painting
13.8  other

ivi:
14.1  scripting
59.7  rendering
8.5   painting
7.5   other

domvm:
49.5  scripting
110.8 rendering
3.8   painting
2.5   other

the main point i'm trying to make, i guess is #316. there can never be vanilla-levels of scripting code in actual data-driven apps, so the baseline of "0 scripting" is unrealistic, though not to the degree that i indicated, sry.

if you just took the raw numbers, it would seem that ivi is 41% slower than vanilla: (14.1 + 59.7 + 8.5) / (0.5 + 49.5 + 8.3). But that's not true because no actual apps can do vanilla's 0.5 scripting time. The baseline should be the fastest data-driven reconciler (which may in fact be ivi or petit)

0001-wip.zip

eavichay · 2017-11-23T06:57:30Z

Can this benchmark have more focus on real world use cases? I wonder what is the use-case for replacing 1/999. I can see use case for alphabetical sort though (which should be similar to replace all rows in non-keyed, but perhaps we can find surprises in keyed implementations). I can see use case for replacing two adjacent rows. Can't see what's the use of swapping 1/999 as opposed to 4/9.

thysultan · 2017-11-23T08:25:16Z

A list of data could be sorted such that it is a single swap operation away from being sorted alphabetically, or generally any X operation – for example within a list that could be randomly ordered via user input of other variables there is an near-infinite number of order-states it could be in before a sort operation(i.e alphabetically) is applied to it. So switching the second and second-last falls within a real-world case.

That said the operations listed are an attempt to be a litmus test to give authors an idea of what the libs reconcile heuristics can guarantee and as such what areas it can improve. The underlining tone of this is that moving more DOM nodes than necessary affects performance but more so also affects DOM state.

eavichay · 2017-11-23T08:30:28Z

@thysultan Sorting alphabetically is a "real-world" use-case. It can be converted to random-shuffle the list (but keep the objects intact). This should make trouble for non-keyed but probably should be handled better by keyed implementations. In order to keep the tests fair - the shuffle algorithm should be implemented in the store object, and it makes sure ALL elements are swapped. I don't mind even if the swapping would be hard-coded to keep it fair.

thysultan · 2017-11-23T08:42:31Z

@eavichay Sorting alphabetically looks like it would put the vanillajs implementation closer to other libs(if it's a random-shuffle), but as an author i would loose the ability to pin-point/reproduce at what exact operation/pattern my lib is performing bad in.

eavichay · 2017-11-23T09:07:09Z

@thysultan I'm not sure what you're getting at. I am suggesting a test for "swap all rows - keep data". Perhaps an Array.reverse may do the trick, but I think it would be too easy to "cheat". Shuffling, then, would make sure the test is fair.

thysultan · 2017-11-23T09:28:36Z

@eavichay My concern is that if you can't reproduce a shuffle, there would be a lose of useful data on what steps lead to a particular result, data that a lib author could use to improve.

eavichay · 2017-11-23T09:40:20Z

@thysultan You can reproduce a shuffle (either using a constant seed which may cause too much javascript code) or simply writing a script that shuffles 1K items, produce the results and I could implement it hardcoded.
Another option is to reverse items in the list (in partitions of n-size)

thysultan · 2017-11-23T13:04:15Z

@eavichay Can you try this for a few frameworks and see what data picture is painted from that?

My only suggestion is to probably only ever move < 50% of the nodes in order to get a picture which libs preserve DOM state of the nodes that don't move, < 50% because the problem with "move everything" is that the concept of least number of DOM operations to get to reconciled state doesn't exist when everything has moved which conceptually would for the most part be the same for each framework.

leeoniya · 2017-12-11T08:13:49Z

after doing a healthy amount of uibench testing, i think @localvoid's "worst case" scenarios [1] are a good next step.

unlike uibench, this benchmark is BYO store implementation as well as needing a UI button for each test case. these requirements make updating all implementations a very substantial task.

if it was possible to avoid having to add a UI button & event binding for each additional metric, the task would still be tedious, but digestible. one way to do this would be to expose functions globally from each impl and have the bench runner invoke them as eg window.snabbdomWorstCase(). it's probably not sustainable to continue growing the UI for every possible store mutation added to this benchmark.

@krausest thoughts?

[1] https://github.com/localvoid/uibench-base/blob/efacae672bbf4133360a928305876ee1de643e64/lib/uibench.ts#L302-L319

Havunen · 2019-07-25T20:23:16Z

It would be nice to have @localvoid 's uibench "moves" added here into this benchmark.

krausest · 2020-12-13T20:40:53Z

I'm closing older issues that won't get fixed.

leeoniya mentioned this issue Nov 20, 2017

use 1/998 for swap rows rather than 4/9 #306

Closed

leeoniya mentioned this issue Nov 21, 2017

update impls to swap rows 1/998 instead of 4/9 #310

Merged

leeoniya mentioned this issue Nov 22, 2017

Is an imperative vanillajs a reasonable baseline for a data-driven bench? #316

Closed

krausest closed this as completed Dec 13, 2020

mohammedzamakhan mentioned this issue Dec 4, 2023

[Snyk] Fix for 3 vulnerabilities mohammedzamakhan/js-framework-benchmark#9

Open

Proposal for new tests(Moves) #305

Proposal for new tests(Moves) #305

Comments

thysultan commented Nov 19, 2017 • edited Loading

leeoniya commented Nov 19, 2017 • edited Loading

thysultan commented Nov 20, 2017

leeoniya commented Nov 20, 2017 • edited Loading

thysultan commented Nov 20, 2017 • edited Loading

leeoniya commented Nov 20, 2017

thysultan commented Nov 20, 2017 • edited Loading

thysultan commented Nov 20, 2017

leeoniya commented Nov 20, 2017 • edited Loading

thysultan commented Nov 20, 2017

leeoniya commented Nov 20, 2017

thysultan commented Nov 20, 2017

leeoniya commented Nov 20, 2017 • edited Loading

thysultan commented Nov 20, 2017

krausest commented Nov 20, 2017

leeoniya commented Nov 20, 2017

leeoniya commented Nov 21, 2017

thysultan commented Nov 21, 2017 • edited Loading

leeoniya commented Nov 21, 2017

thysultan commented Nov 21, 2017 • edited Loading

localvoid commented Nov 22, 2017 • edited Loading

leeoniya commented Nov 22, 2017

localvoid commented Nov 22, 2017 • edited Loading

thysultan commented Nov 22, 2017

localvoid commented Nov 22, 2017 • edited Loading

thysultan commented Nov 22, 2017 • edited Loading

leeoniya commented Nov 22, 2017

localvoid commented Nov 22, 2017 • edited Loading

thysultan commented Nov 22, 2017

thysultan commented Nov 22, 2017

leeoniya commented Nov 22, 2017 • edited Loading

eavichay commented Nov 23, 2017 • edited Loading

thysultan commented Nov 23, 2017

eavichay commented Nov 23, 2017 • edited Loading

thysultan commented Nov 23, 2017

eavichay commented Nov 23, 2017

thysultan commented Nov 23, 2017

eavichay commented Nov 23, 2017

thysultan commented Nov 23, 2017

leeoniya commented Dec 11, 2017

Havunen commented Jul 25, 2019

krausest commented Dec 13, 2020

thysultan commented Nov 19, 2017 •

edited

Loading

leeoniya commented Nov 19, 2017 •

edited

Loading

leeoniya commented Nov 20, 2017 •

edited

Loading

thysultan commented Nov 20, 2017 •

edited

Loading

thysultan commented Nov 20, 2017 •

edited

Loading

leeoniya commented Nov 20, 2017 •

edited

Loading

leeoniya commented Nov 20, 2017 •

edited

Loading

thysultan commented Nov 21, 2017 •

edited

Loading

thysultan commented Nov 21, 2017 •

edited

Loading

localvoid commented Nov 22, 2017 •

edited

Loading

localvoid commented Nov 22, 2017 •

edited

Loading

localvoid commented Nov 22, 2017 •

edited

Loading

thysultan commented Nov 22, 2017 •

edited

Loading

localvoid commented Nov 22, 2017 •

edited

Loading

leeoniya commented Nov 22, 2017 •

edited

Loading

eavichay commented Nov 23, 2017 •

edited

Loading

eavichay commented Nov 23, 2017 •

edited

Loading