model=meta-llama/Meta-Llama-3.1-8B-Instruct
volume=$PWD/data
docker run -d --gpus all --env-file .env --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/predibase/lorax:main --model-id $model
After the model finishes downloading, the server fails with this error:
'FlashLlamaAttention' object has no attribute 'fp8_kv'
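For context, a message of this form is a Python AttributeError: some code read `fp8_kv` on an instance where it had never been assigned. The sketch below reproduces that class of failure with a hypothetical stand-in class. It is not the LoRAX implementation, and the constructor argument `use_fp8_kv` and the method `forward_guarded` are assumptions made only for illustration.

```python
class FlashLlamaAttention:  # hypothetical stand-in, NOT the real LoRAX class
    def __init__(self, use_fp8_kv=None):
        # Assumed bug pattern: the attribute is assigned on one code path only.
        if use_fp8_kv is not None:
            self.fp8_kv = use_fp8_kv

    def forward(self):
        # Raises AttributeError when __init__ never assigned fp8_kv.
        return self.fp8_kv

    def forward_guarded(self):
        # Defensive read that falls back to False when the attribute is missing.
        return getattr(self, "fp8_kv", False)


def error_message():
    """Return the AttributeError text raised by the unguarded read."""
    try:
        FlashLlamaAttention().forward()
    except AttributeError as exc:
        return str(exc)
    return None
```

Whether the actual regression follows this pattern has not been checked against the LoRAX source; the full traceback from the container logs would show where the attribute is read.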
Expected behavior
The web server should start up. It worked with an older version. There seems to be a problem after a recent commit went into main.
System Info
Amazon Linux 2
Running on an NVIDIA L40S GPU