Preload Qwen3.6-27B and Qwen3.6-35B-A3B at startup (load-on-startup) so both are warm immediately under --models-max 2, set parallel = 1 as the [*] fallback for any other model, and adjust per-model context size and draft depth.
| Name |
|---|
| aarch64-darwin |
| aarch64-linux |
| x86_64-darwin/mpro |
| x86_64-linux |
| nixbuild.nix |