Issues: tenstorrent/tt-inference-server
File permissions for persistent_volume
bug
Something isn't working
#77
opened Jan 23, 2025 by
tstescoTT
setup.sh to check host for running out of disk space when setting up new model
enhancement
New feature or request
#76
opened Jan 22, 2025 by
tstescoTT
Setup.sh check and error message for HF 401 error when gated repo (e.g. Llama 3.x repos) does not have authorization.
enhancement
New feature or request
#75
opened Jan 21, 2025 by
tstescoTT
Verify device topology automatically
enhancement
New feature or request
#74
opened Jan 21, 2025 by
tstescoTT
MESH_DEVICE management for Llama 3.x implementations
enhancement
New feature or request
#73
opened Jan 21, 2025 by
tstescoTT
Add API authorization to YOLOv4 server
enhancement
New feature or request
#64
opened Dec 24, 2024 by
bgoelTT
Add health check to YOLOv4 server
enhancement
New feature or request
#63
opened Dec 24, 2024 by
bgoelTT
vLLM+Llama3.1-70B docker container built from scratch caused an exception in load_model
bug
Something isn't working
#62
opened Dec 24, 2024 by
yasuhiroitoTT
vLLM run script prefill + decode trace pre-capture to avoid TTFT on first completions being unexpectedly high or stalling
enhancement
New feature or request
#56
opened Dec 12, 2024 by
tstescoTT
Missing --max_prompt_length argument running example_requests_client_alpaca_eval.py
#51
opened Dec 2, 2024 by
milank94
Provide example chat template usage
documentation
Improvements or additions to documentation
enhancement
New feature or request
#36
opened Nov 15, 2024 by
tstescoTT
Add status messaging and endpoint to allow for client-side users to reason about model initialization and life cycle.
enhancement
New feature or request
#17
opened Sep 26, 2024 by
tstescoTT
Capture tt-metal and tt-NN loguru logs in inference server python log files
enhancement
New feature or request
#13
opened Sep 25, 2024 by
tstescoTT