model
Manage local LLM models. Models are stored under ~/.foil/models/ and registered in ~/.foil/models.json.
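To see what is currently on disk, the two locations above can be inspected directly. The following is a minimal sketch: the function name `foil_store_summary` is hypothetical (it is not part of Foil), and because the schema of `models.json` is not described here, the registry is only pretty-printed rather than parsed.

```shell
# foil_store_summary [FOIL_HOME]
# Hypothetical helper, not part of the foil CLI.
# Prints the on-disk size of each entry under <FOIL_HOME>/models/ and
# pretty-prints <FOIL_HOME>/models.json. FOIL_HOME defaults to ~/.foil.
foil_store_summary() {
  foil_home="${1:-$HOME/.foil}"

  # One line per downloaded model directory; silent if none exist yet.
  if [ -d "$foil_home/models" ]; then
    du -sh "$foil_home/models"/* 2>/dev/null || true
  fi

  # The registry schema is not documented here, so just pretty-print it.
  if [ -f "$foil_home/models.json" ]; then
    python3 -m json.tool "$foil_home/models.json"
  fi
}

foil_store_summary
```

On a machine where nothing has been downloaded yet, the function prints nothing.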
foil model
Manage LLM models.
Usage:
Options:
foil model activate
Activate a downloaded model. Restarts vllm-mlx if the server is running.
Usage:
Options:
foil model delete
Delete a downloaded model.
Usage:
Options:
foil model download
Download a model from HuggingFace.
Usage:
Options:
foil model list
List downloaded models.
Usage:
Options:
Default model
Foil ships with mlx-community/Qwen2.5-Coder-7B-Instruct-4bit as the default. It's a ~4 GB, 4-bit quantised, code-specialised model that runs at ~50 tokens/s on M-series Macs.
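The ~4 GB figure can be sanity-checked with back-of-the-envelope arithmetic: parameters times bits per weight, divided by 8 bits per byte. The sketch below assumes nominal parameter counts of 7 and 14 billion read off the model names; `estimate_weights_gb` is a hypothetical helper, not a foil command. Real files are typically somewhat larger than this lower bound, because quantised formats also store per-group scales and some tensors at higher precision.

```shell
# estimate_weights_gb PARAMS_BILLION BITS_PER_WEIGHT
# Hypothetical helper: lower-bound size of quantised weights in GB (10^9 bytes),
# computed as parameters * bits / 8.
estimate_weights_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_weights_gb 7 4    # prints 3.5
estimate_weights_gb 14 4   # prints 7.0
```

Both results sit just under the sizes quoted in this page, which is what the extra quantisation metadata would suggest.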
Switching models
# Download an alternative
foil model download mlx-community/Qwen2.5-Coder-14B-Instruct-4bit
# Activate it for new scans
foil model activate Qwen2.5-Coder-14B-Instruct-4bit
# Verify
foil server status
Larger models are generally more accurate but need more unified memory: the 14B model is ≈ 8 GB and needs a Mac with 24 GB or more. The engine restarts automatically on activation if the server is running.
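Since the 14B model needs a 24 GB+ Mac, it can be worth checking installed memory before starting the download. The sketch below is one way to do that, under stated assumptions: `has_enough_memory` is a hypothetical helper (not part of Foil), it reads total memory from macOS's `sysctl -n hw.memsize`, and it treats "24 GB" as 24 GiB, which is how Mac memory sizes are reported. The foil commands themselves are the ones shown above.

```shell
# has_enough_memory REQUIRED_GB [TOTAL_BYTES]
# Hypothetical helper: succeeds if total memory is at least REQUIRED_GB GiB.
# TOTAL_BYTES defaults to the value macOS reports via `sysctl -n hw.memsize`;
# on systems without that key the check fails rather than guessing.
has_enough_memory() {
  required_gb="$1"
  total_bytes="${2:-$(sysctl -n hw.memsize 2>/dev/null || true)}"
  [ -n "$total_bytes" ] || return 1
  awk -v t="$total_bytes" -v r="$required_gb" \
    'BEGIN { exit !(t >= r * 1024 * 1024 * 1024) }'
}

# Switch to the 14B model only if foil is installed and memory is sufficient.
if command -v foil >/dev/null 2>&1 && has_enough_memory 24; then
  foil model download mlx-community/Qwen2.5-Coder-14B-Instruct-4bit
  foil model activate Qwen2.5-Coder-14B-Instruct-4bit
  foil server status
fi
```

The optional second argument exists only so the check can be exercised with a known byte count, for example `has_enough_memory 24 17179869184` to simulate a 16 GB machine.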