Make sure that all workers are notified about end of execution loop #730
Currently, scripts hang at exit when running with TP>1 and multistep scheduling. The cause is that the driver worker never notifies the other workers that the execution loop has ended.
This PR works around the issue by making sure that all workers are notified at the end of the `llm_engine` loop. Another possible workaround would be to extend this check: https://github.com/HabanaAI/vllm-fork/blob/habana_main/vllm/engine/llm_engine.py#L1379 with `or not self.has_unfinished_requests()`.
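To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not vLLM code: `worker_loop`, `run_driver`, the `STOP` sentinel, and the `notify_on_exit` flag are all hypothetical names chosen for this toy model. It uses plain threads and queues to stand in for tensor-parallel workers, and shows only that workers blocked waiting for the next step never finish unless the driver explicitly tells them the loop is over.

```python
import queue
import threading

# Hypothetical sentinel meaning "the execution loop has ended".
STOP = None


def worker_loop(inbox, done):
    """Toy non-driver worker: blocks until the driver sends a step or STOP."""
    while True:
        item = inbox.get()
        if item is STOP:
            break
        # A real worker would execute one model step here.
    done.set()


def run_driver(num_workers, steps, notify_on_exit):
    """Toy driver: broadcasts `steps` steps, then optionally notifies workers.

    Returns one boolean per worker saying whether it left its loop.
    """
    inboxes = [queue.Queue() for _ in range(num_workers)]
    done_flags = [threading.Event() for _ in range(num_workers)]
    threads = [
        threading.Thread(target=worker_loop, args=(q, d), daemon=True)
        for q, d in zip(inboxes, done_flags)
    ]
    for t in threads:
        t.start()

    # Broadcast every step to every worker.
    for step in range(steps):
        for q in inboxes:
            q.put(step)

    # The fix modelled here: tell every worker that the loop is over.
    if notify_on_exit:
        for q in inboxes:
            q.put(STOP)

    # Bounded wait, so the demo itself cannot hang.
    for t in threads:
        t.join(timeout=0.5)
    return [d.is_set() for d in done_flags]


if __name__ == "__main__":
    print("with notification:   ", run_driver(2, 3, notify_on_exit=True))
    print("without notification:", run_driver(2, 3, notify_on_exit=False))
```

In this toy model the workers are daemon threads, so the process can still exit without notification. With real worker processes waiting on a collective operation, the equivalent situation would show up as the hang at the end of the script.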