Partial Map-Reduce #491
Commits on Sep 23, 2024
test: use vardir for repository copying
Upgrade tests use git to clone the current repository and check out the necessary versions. The cloned repository was always saved in vshard/test/var, even when the actual --var argument for test-run.py was /tmp/var. Store the copied code under the path given in --var instead.

NO_DOC=internal
Commit cea68f3
Commits on Oct 10, 2024
test: enable strict mode in Lua
It makes unknown variables raise errors instead of evaluating to nil. Otherwise it is too easy to use a wrong variable name somewhere and get nil, and the tests would still pass without testing what they are supposed to.

NO_DOC=internal
Commit c437f69
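The effect of strict mode can be illustrated in plain Lua with a metatable that rejects access to undeclared names. This is only a sketch of the general technique, not the code from the commit: the helper `enable_strict` and the private `env` table are hypothetical, and the actual tests may enable strict mode by other means.

```lua
-- Hypothetical helper: make reads and writes of undeclared names raise
-- an error instead of silently yielding nil.
local function enable_strict(env)
    local mt = getmetatable(env) or {}
    mt.__index = function(_, name)
        error(("variable '%s' is not declared"):format(tostring(name)), 2)
    end
    mt.__newindex = function(_, name)
        error(("assignment to undeclared variable '%s'"):format(tostring(name)), 2)
    end
    return setmetatable(env, mt)
end

-- A private environment table is used so the sketch does not touch the
-- real globals.
local env = enable_strict({})
rawset(env, 'bucket_count', 100) -- declared explicitly

-- Reading a declared name works as usual.
local ok_read = pcall(function() return env.bucket_count end)
-- A typo in the name is now an error, not a nil.
local ok_typo, typo_err = pcall(function() return env.bucket_cnt end)
```

Without the metatable the misspelled `bucket_cnt` would evaluate to nil and a test comparing it against nil could pass by accident.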
router: extract a couple of map-reduce helpers
The only map-reduce function is router.map_callrw(). At some point there was a task to introduce a new mode of Map-Reduce: partial, by bucket IDs, not on the whole cluster. A new function router.map_part_callrw() was introduced for that task. It had the same Map-Reduce and error handling stages; only the Ref stage was different. The new helpers in this commit were supposed to share code between those two map-call functions. Later it was decided to keep just one map-call function and add a new option to it. The helpers are still useful as separate functions: they make the map-call function really small and simple.

NO_DOC=internal
NO_TEST=refactoring
Commit 5e3ebb4
storage: introduce ref.check()
It ensures the ref is still in place. It is a read-only operation. It is going to be used in future commits about partial map-reduce: the router may visit some storages more than once, and each time it would ensure the ref is still in place.

NO_DOC=internal
Commit 165dc18
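A read-only ref check of this kind could look roughly as follows. The registry layout (session ID, then ref ID) and the names `ref_add`, `ref_del`, and `ref_check` are assumptions made for illustration; they are not taken from the vshard sources.

```lua
-- Hypothetical in-memory registry of refs: session ID -> ref ID -> true.
local refs = {}

local function ref_add(sid, rid)
    refs[sid] = refs[sid] or {}
    refs[sid][rid] = true
    return true
end

local function ref_del(sid, rid)
    if refs[sid] then
        refs[sid][rid] = nil
    end
end

-- Read-only: reports whether the ref still exists and changes nothing.
local function ref_check(sid, rid)
    local session = refs[sid]
    if session == nil or not session[rid] then
        return nil, ("ref %d of session %d does not exist"):format(rid, sid)
    end
    return true
end

ref_add(1, 10)
local ok_before = ref_check(1, 10)
ref_del(1, 10)
local ok_after, check_err = ref_check(1, 10)
```

Because the check does not modify the registry, a router can call it on every repeated visit to the same storage without side effects.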
storage: implement partial map-reduce API
The storage side of the Partial Map-Reduce feature.

NO_DOC=later
Commit b743601
Make the tests pass from prev commit
The previous commit was failing some tests. Let's patch them up. That commit isn't amended, to keep its original shape out of respect for the external contributor.

NO_DOC=bugfix
Commit 87aefc7
router: implement partial map-reduce API
Introduce a partial ref-map-reduce API for vshard. It guarantees that, in case of success, the function is executed exactly once on the storages that contain the given list of buckets.

NO_DOC=later
Commit 5ab5b4c
Make the tests pass from prev commit
The previous commit was failing some tests. Let's patch them up. That commit isn't amended, to keep its original shape out of respect for the external contributor.

NO_DOC=bugfix
Commit 8c95293
router: improve master connection parallelism in map-reduce
This is useful for RW map-reduce requests, which need to send multiple network requests in parallel to multiple masters. In parallel here means using the is_async netbox feature, but it only works if the connection is already established. That means the connection establishment ideally must also be parallel.

NO_DOC=internal
NO_TEST=already covered
Commit ff8e8a0
Review fixes for Partial Map-Reduce
There were a number of minor issues with the previous several commits: the tests ran way too long, some cases were not covered, and the code was non-critically suboptimal. Let's fix them all. The original commits aren't amended, to keep their original shape out of respect for the external contributor.

NO_DOC=bugfix
Commit 7ff139a
router: move Ref stage of Map-Reduce into new func
There are 2 Ref-Map-Reduce functions right now: map_callrw() and map_part_callrw(). Their only difference is that the former refs the whole cluster, while the latter refs only a subset of storages. The rest is the same. The idea is to merge these functions into one and make the bucket IDs an option. This commit extracts the Ref stages of both functions into separate helpers, which will keep the future single function very short and simple.

NO_DOC=internal
NO_TEST=refactoring
Commit b2c3c6e
router: merge map_callrw and map_part_callrw
The behavior is regulated with the new bucket_ids option.

@TarantoolBot document
Title: vshard: `bucket_ids` option for `router.map_callrw()`

The option is an array of numeric bucket IDs. When specified, the Ref-Map-Reduce is only performed on the masters having at least one of these buckets. By default all the stages are done on all masters in the cluster.

Example:
```Lua
-- Assume buckets 1, 2, 3 cover replicasets UUID_A and UUID_B.
res, err = vshard.router.map_callrw(func, args, {bucket_ids = {1, 2, 3}})
-- Illustrative only: `==` compares Lua tables by reference, so read the
-- asserts as "res[UUID] holds {result}".
assert(res[UUID_A] == {func_result_from_A})
assert(res[UUID_B] == {func_result_from_B})
```
Commit c72abb9
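How the `bucket_ids` option narrows the set of masters can be sketched by grouping the requested buckets by their owner. The `route_map` table and the `group_buckets` function below are hypothetical; a real router learns bucket locations through discovery and the storages confirm them during the Ref stage.

```lua
-- Hypothetical route map: bucket ID -> replicaset UUID, as a router might
-- cache it after discovery.
local route_map = {
    [1] = 'UUID_A', [2] = 'UUID_A', [3] = 'UUID_B', [4] = 'UUID_C',
}

-- Group the requested buckets by the replicaset believed to own them.
-- Buckets with an unknown location are returned separately.
local function group_buckets(bucket_ids, routes)
    local by_replicaset, unknown = {}, {}
    for _, id in ipairs(bucket_ids) do
        local uuid = routes[id]
        if uuid == nil then
            table.insert(unknown, id)
        else
            by_replicaset[uuid] = by_replicaset[uuid] or {}
            table.insert(by_replicaset[uuid], id)
        end
    end
    return by_replicaset, unknown
end

local groups, unknown = group_buckets({1, 2, 3}, route_map)
-- Only UUID_A and UUID_B would take part in Ref, Map and Reduce here.
-- UUID_C is not contacted because none of its buckets were requested.
```

In this model the replicasets that appear as keys of `groups` are exactly the ones expected as keys of the `map_callrw()` result.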
test: move test_map_callrw_raw() to another file
Let's merge all map_callrw() tests into a single file.

NO_DOC=internal
Commit 59deb28
test: +1 replicaset to map_callrw() tests
When there were only 2 replicasets, all cases would cover either a single replicaset or "all" of them. Let's make it 3, so that some tests actually cover a part of the cluster which is not just a single replicaset.

NO_DOC=internal
Commit ca39cac
storage: fix moved buckets check
The 'moved_buckets' function had several issues:
- It treated as "moved" all the buckets which are not strictly ACTIVE, which isn't optimal.
- It assumed that the buckets stay unchanged between the start and the end of ref creation. That isn't true.
- The moved buckets could carry the destination they moved to. Returning it to the router would make the re-discovery faster.
- PINNED buckets were not considered ACTIVE.

The commit fixes all these issues.

Firstly, when a bucket is SENDING, returning an error right away isn't good. The router would just keep retrying without any progress. The bucket is actually here, it is not moved yet. It is better to let the storage try to take a ref. Then one of 2 results is possible:
- It waits without useless active retries. Then SENDING fails and the bucket becomes ACTIVE. The ref is taken, all good.
- It waits without useless active retries. SENDING turns into SENT. The ref is taken for the other buckets, and this one is reported as moved.

Similar logic applies to RECEIVING.

Secondly, after a ref is taken, the not-moved buckets could become moved, so they need to be re-checked before returning the ref. Luckily, the storage can use bucket_generation to avoid this double-check when nothing changed in _bucket.

NO_DOC=bugfix
Commit ead3770
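The revised rules can be summarized as a small classifier over bucket statuses. The status names come from the commit message; the table layout, the `destination` field, and the function names `classify` and `needs_recheck` are assumptions for illustration. Actual waiting on SENDING and RECEIVING buckets is only represented here as a "wait" verdict.

```lua
-- Hypothetical snapshot of the _bucket space: bucket ID -> record.
local buckets = {
    [1] = {status = 'ACTIVE'},
    [2] = {status = 'PINNED'},
    [3] = {status = 'SENDING', destination = 'UUID_B'},
    [4] = {status = 'SENT', destination = 'UUID_B'},
}

-- Classify the requested buckets:
--   here  - can be covered by the ref right away (ACTIVE, PINNED);
--   wait  - in transfer, worth waiting on instead of failing at once;
--   moved - not on this storage, reported with the destination if known.
local function classify(bucket_ids, space)
    local here, wait, moved = {}, {}, {}
    for _, id in ipairs(bucket_ids) do
        local b = space[id]
        local status = b and b.status
        if status == 'ACTIVE' or status == 'PINNED' then
            table.insert(here, id)
        elseif status == 'SENDING' or status == 'RECEIVING' then
            table.insert(wait, id)
        else
            table.insert(moved, {id = id, dst = b and b.destination})
        end
    end
    return here, wait, moved
end

-- The second check after taking the ref is needed only if _bucket changed
-- in between, which a generation counter can tell cheaply.
local function needs_recheck(generation_before, generation_after)
    return generation_before ~= generation_after
end

local here, wait, moved = classify({1, 2, 3, 4, 5}, buckets)
```

Bucket 5 is absent from the snapshot, so it is reported as moved without a destination, while bucket 4 carries the destination that would let the router skip a full re-discovery.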
storage: fix moved buckets ref check
During the partial Map-Reduce the router might visit some storages more than once. This happens when, after a ref on storage-A, another storage-B reports A as having taken some buckets. Then the router would come back to A to confirm that. The storage must still hold its previously created ref for such checks to make any sense. Otherwise any of the previously confirmed buckets could have escaped by now. Without the ref check the router could reach the Map stage and send some Map requests even though it could have detected earlier that not all storages would succeed. Strictly speaking this wasn't a bug, but it was clearly suboptimal behaviour: the requests would be executed on only some of the needed storages while the others would report errors.

NO_DOC=internal
Commit f316ad1
test: rename map_part_test to map_callrw_test
It tests not only partial Map-Reduce; it covers a bit of the full one as well.

NO_DOC=internal
Commit 08aa380