Preload Qwen3.6-27B and Qwen3.6-35B-A3B at startup (load-on-startup) so both are warm immediately under --models-max 2, set parallel = 1 as the [*] fallback for any other model, and adjust per-model context size and draft depth.

| Name |
|---|
| amd |
| attic |
| halo |
| mx |
| nixtee1 |
| sgx |
| t15 |
| x1 |
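
The preset change described in the commit message can be illustrated with a small sketch. It is not the repository's actual configuration: the section names, the `ctx-size` and `draft-max` key names, and every number below are placeholders assumed for illustration, and Python's `configparser` is used only to model the idea that `[*]` supplies defaults for any model without its own overrides. The real server may resolve presets differently.

```python
import configparser

# Hypothetical preset text. Only `load-on-startup`, `parallel = 1`, and `[*]`
# come from the commit message; all other keys and values are placeholders.
PRESET = """
[*]
parallel = 1

[Qwen3.6-27B]
load-on-startup = true
ctx-size = 32768
draft-max = 8

[Qwen3.6-35B-A3B]
load-on-startup = true
ctx-size = 65536
draft-max = 4
"""


def load_presets(text):
    """Parse the preset text, treating [*] as the section that holds defaults."""
    parser = configparser.ConfigParser(default_section="*")
    parser.read_string(text)
    return parser


def settings_for(parser, model):
    """Return the model's settings merged with [*]; unknown models get [*] only."""
    if parser.has_section(model):
        return dict(parser[model])
    return dict(parser.defaults())


def preloaded(parser):
    """List the models flagged to load at startup, in file order."""
    return [
        name
        for name in parser.sections()
        if parser[name].getboolean("load-on-startup", fallback=False)
    ]


if __name__ == "__main__":
    presets = load_presets(PRESET)
    print(preloaded(presets))
    print(settings_for(presets, "some-other-model"))
```

In this sketch both Qwen entries are reported as preloaded, which fits within a limit of two resident models, while any model without its own section falls back to `parallel = 1`.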