WIP: Multi-cluster LLM deployment #1379

Draft
wants to merge 11 commits into base: main

TarasRudko
Contributor

Description

This PR adds a sample that uses a multi-cluster Gateway to route traffic to LLM inference servers deployed on clusters in multiple regions.

Tasks

  • The contributing guide has been read and followed.
  • The samples added / modified have been fully tested.
  • Workflow files have been added / modified, if applicable.
  • Region tags have been properly added, if new samples.
  • All dependencies are set to up-to-date versions, as applicable.

TarasRudko requested review from alizaidis, yoshi-approver and a team as code owners on July 18, 2024 14:24

snippet-bot bot commented Jul 18, 2024

Here is the summary of changes.

You are about to add 2 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

TarasRudko marked this pull request as draft on July 19, 2024 19:29