mlx-engine - Apple MLX LLM Engine for LM Studio
- mlx-lm - Apple MLX inference engine (MIT)
- Outlines - Structured output for LLMs (Apache 2.0)
- mlx-vlm - Vision model inferencing for MLX (MIT)
LM Studio 0.3.4 and newer for Mac ships pre-bundled with mlx-engine. Download LM Studio from here
- macOS 14.0 (Sonoma) or greater.
- python3.11
- The requirements.txt file is compiled specifically for python3.11, the Python version bundled with the LM Studio MLX runtime.
brew install python@3.11
is a quick way to add python3.11 to your path without breaking your default Python setup.
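Because requirements.txt targets python3.11, a setup script can check the interpreter version before installing anything. The following is a minimal sketch; the helper name `is_supported_python` is hypothetical and is not part of mlx-engine.

```python
import sys

# Version that requirements.txt is compiled for, per the prerequisites above.
REQUIRED = (3, 11)


def is_supported_python(version_info=sys.version_info):
    """Return True when the major.minor version equals 3.11."""
    return tuple(version_info[:2]) == REQUIRED


# Report whether the interpreter running this snippet matches.
print("python3.11 detected:", is_supported_python())
```

Running this inside the activated virtual environment should confirm that `python` resolves to a 3.11 interpreter.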
To run a demo of model load and inference:
- Clone the repository
git clone https://github.com/lmstudio-ai/mlx-engine.git
cd mlx-engine
- Create a virtual environment (optional)
python3.11 -m venv .venv
source .venv/bin/activate
- Install the required dependency packages
pip install -U -r requirements.txt
Download models with the lms CLI tool. The lms CLI documentation can be found here: https://lmstudio.ai/docs/cli
Run the demo.py script with an MLX text generation model:
lms get mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
python demo.py --model ~/.cache/lm-studio/models/mlx-community/Meta-Llama-3.1-8B-Instruct-4bit
mlx-community/Meta-Llama-3.1-8B-Instruct-4bit - 4.53 GB
This command uses a default prompt that is formatted for Llama-3.1. For other models, add a custom --prompt argument with the correct prompt formatting:
lms get mlx-community/Mistral-Small-Instruct-2409-4bit
python demo.py --model ~/.cache/lm-studio/models/mlx-community/Mistral-Small-Instruct-2409-4bit --prompt "<s> [INST] How long will it take for an apple to fall from a 10m tree? [/INST]"
mlx-community/Mistral-Small-Instruct-2409-4bit - 12.52 GB
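The prompt string in the Mistral command above can also be assembled programmatically. This sketch only reproduces the single-turn `[INST]` wrapping shown in that command; the function name `mistral_instruct_prompt` is hypothetical, and the model's own chat template may differ for multi-turn or system messages.

```python
def mistral_instruct_prompt(user_message: str) -> str:
    """Wrap one user message in the [INST] markers used in the command above."""
    return f"<s> [INST] {user_message} [/INST]"


prompt = mistral_instruct_prompt(
    "How long will it take for an apple to fall from a 10m tree?"
)
print(prompt)
```

The resulting string is what would be passed as the value of --prompt.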
Run the demo.py script with an MLX vision model:
lms get mlx-community/pixtral-12b-4bit
python demo.py --model ~/.cache/lm-studio/models/mlx-community/pixtral-12b-4bit --prompt "<s>[INST]Compare these images[IMG][IMG][/INST]" --images demo-data/chameleon.webp demo-data/toucan.jpeg
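In the Pixtral command above, the prompt contains two `[IMG]` placeholders and two files follow --images. Assuming one placeholder per image (an inference from that example, not a documented rule), the prompt could be built as follows. The function name `pixtral_prompt` is hypothetical.

```python
def pixtral_prompt(instruction: str, image_paths: list[str]) -> str:
    """Build a Pixtral-style prompt with one [IMG] placeholder per image.

    Assumption: the number of [IMG] tokens should equal the number of
    image files, as in the two-image command above.
    """
    placeholders = "[IMG]" * len(image_paths)
    return f"<s>[INST]{instruction}{placeholders}[/INST]"


images = ["demo-data/chameleon.webp", "demo-data/toucan.jpeg"]
prompt = pixtral_prompt("Compare these images", images)
print(prompt)
```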
Currently supported vision models include:
- Llama-3.2-Vision
lms get mlx-community/Llama-3.2-11B-Vision-Instruct-4bit
- Pixtral
lms get mlx-community/pixtral-12b-4bit
- Qwen2-VL
lms get mlx-community/Qwen2-VL-7B-Instruct-4bit
- Llava-v1.6
lms get mlx-community/llava-v1.6-mistral-7b-4bit
Run the demo.py script with an MLX text generation model and a compatible --draft-model:
lms get mlx-community/Qwen2.5-7B-Instruct-4bit
lms get lmstudio-community/Qwen2.5-0.5B-Instruct-MLX-8bit
python demo.py \
--model ~/.lmstudio/models/mlx-community/Qwen2.5-7B-Instruct-4bit \
--draft-model ~/.lmstudio/models/lmstudio-community/Qwen2.5-0.5B-Instruct-MLX-8bit \
--prompt "<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
Write a quick sort algorithm in C++<|im_end|>
<|im_start|>assistant
"
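Multi-line prompts like the one above are easy to misquote in a shell. As an alternative, this sketch builds the same prompt text in Python and assembles the argument list for demo.py using only the flags shown above. The helper name `chatml_prompt` is hypothetical, and the model directory is assumed to be the one used in the command above.

```python
import os
import shlex


def chatml_prompt(system: str, user: str) -> str:
    """Reproduce the system/user/assistant layout of the prompt shown above."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = chatml_prompt(
    "You are Qwen, created by Alibaba Cloud. You are a helpful assistant.",
    "Write a quick sort algorithm in C++",
)

# Assumed model location, copied from the command above.
models_dir = os.path.expanduser("~/.lmstudio/models")
argv = [
    "python", "demo.py",
    "--model", os.path.join(models_dir, "mlx-community/Qwen2.5-7B-Instruct-4bit"),
    "--draft-model", os.path.join(
        models_dir, "lmstudio-community/Qwen2.5-0.5B-Instruct-MLX-8bit"
    ),
    "--prompt", prompt,
]

# Print a shell-safe version of the command for inspection.
print(shlex.join(argv))
```

Passing `argv` to `subprocess.run(argv)` from the repository root would launch the demo without any shell quoting of the prompt.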
To run tests, run the following command from the root of this repo:
python -m unittest discover tests
To test specific vision models:
python -m unittest tests/test_vision_models.py -k pixtral
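The -k option keeps only tests whose names match the given pattern; a pattern without a wildcard is treated as a case-sensitive substring. The sketch below shows the equivalent filtering through `unittest.TestLoader.testNamePatterns`. The test class and method names are hypothetical and are not taken from tests/test_vision_models.py.

```python
import unittest


class VisionModelCases(unittest.TestCase):
    """Hypothetical stand-in for a vision model test class."""

    def test_pixtral_loads(self):
        self.assertTrue(True)

    def test_qwen2_vl_loads(self):
        self.assertTrue(True)


loader = unittest.TestLoader()
# "-k pixtral" on the command line corresponds to the pattern "*pixtral*".
loader.testNamePatterns = ["*pixtral*"]
suite = loader.loadTestsFromTestCase(VisionModelCases)

# Collect the method names that survived the filter.
selected = [test.id().split(".")[-1] for test in suite]
print(selected)
```

Only the method whose name contains "pixtral" is selected, so the same -k flag with another substring (for example a different model name) would narrow the run in the same way.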