MESH_DEVICE management for Llama 3.x implementations #73

Open
tstescoTT opened this issue Jan 21, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@tstescoTT
Contributor

It's easy to misconfigure the MESH_DEVICE environment variable, and misconfigurations are difficult for users to debug. Each model implementation may have a different configuration for:

  1. a known-good default MESH_DEVICE, and
  2. the set of valid MESH_DEVICE configurations

Some explanation is available at https://github.com/tenstorrent/vllm/tree/dev/tt_metal#running-the-offline-inference-example

The model configs for tt-metal Llama 3.x model implementations use MESH_DEVICE to set key mesh_device settings: https://github.com/tenstorrent/tt-metal/blob/main/models/demos/llama3/tt/model_config.py#L84
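One way to address this is to validate MESH_DEVICE at server startup against a per-model table of a known-good default and the set of valid options. The sketch below is illustrative only: the table contents, model names, and the `ensure_mesh_device` signature are assumptions, not the actual implementation in `run_vllm_api_server.py`.

```python
import os

# Hypothetical per-model MESH_DEVICE configuration: a known-good default
# plus the set of valid options. Entries and values are illustrative and
# do not reflect the real tt-metal model configs.
MESH_DEVICE_CONFIG = {
    "llama-3.1-8b": {"default": "N300", "valid": {"N150", "N300", "T3K"}},
    "llama-3.1-70b": {"default": "T3K", "valid": {"T3K", "TG"}},
}


def ensure_mesh_device(model_name: str) -> str:
    """Validate MESH_DEVICE for the given model.

    Falls back to the model's known-good default when MESH_DEVICE is
    unset, and fails early with an actionable message when the value is
    not in the model's valid set.
    """
    cfg = MESH_DEVICE_CONFIG.get(model_name)
    if cfg is None:
        raise ValueError(f"No MESH_DEVICE config for model: {model_name}")

    mesh_device = os.environ.get("MESH_DEVICE")
    if mesh_device is None:
        # Unset: apply the known-good default so downstream code sees it.
        mesh_device = cfg["default"]
        os.environ["MESH_DEVICE"] = mesh_device
    elif mesh_device not in cfg["valid"]:
        # Misconfigured: fail fast instead of surfacing an obscure
        # runtime error deep inside the model implementation.
        raise ValueError(
            f"MESH_DEVICE={mesh_device} is not valid for {model_name}; "
            f"choose one of {sorted(cfg['valid'])}"
        )
    return mesh_device
```

Failing fast at startup with the list of valid options gives users a direct fix, rather than an obscure device-mesh error later in model initialization.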

@tstescoTT tstescoTT added the enhancement New feature or request label Jan 21, 2025
tstescoTT added a commit that referenced this issue Jan 27, 2025
## Change Log
- add Llama 3.2 Vision image input support in utils prompt generation and benchmarking script
- add MMMU with support from https://github.com/tstescoTT/lm-evaluation-harness/tree/tstesco/add-local-multimodal
- address #73 in run_vllm_api_server.py::ensure_mesh_device MESH_DEVICE handling
- fix #62 with run_vllm_api_server.py::register_vllm_models
tstescoTT added a commit that referenced this issue Jan 28, 2025
* Llama 3.x multimodal support for evaluations and benchmarking

## Change Log
- add Llama 3.2 Vision image input support in utils prompt generation and benchmarking script
- add MMMU with support from https://github.com/tstescoTT/lm-evaluation-harness/tree/tstesco/add-local-multimodal
- address #73 in run_vllm_api_server.py::ensure_mesh_device MESH_DEVICE handling
- fix #62 with run_vllm_api_server.py::register_vllm_models
- rename batch_size -> max_concurrent in client-side scripts to indicate that they only set the maximum number of concurrent requests, not the actual model batch size
- add missing empty line at end of shell scripts