From ccd87508995bf12835e7aee2ea4009b141ab96c6 Mon Sep 17 00:00:00 2001
From: Harald Hoyer
Date: Thu, 21 May 2026 23:15:09 +0200
Subject: [PATCH] chore(halo): set spec-draft-p-min for coder model

Add a 0.74 confidence threshold so speculative drafting stops early once
the draft model's predicted token probability drops below it, favoring
shorter, higher-acceptance draft sequences.
---
 systems/x86_64-linux/halo/models.ini | 1 +
 1 file changed, 1 insertion(+)

diff --git a/systems/x86_64-linux/halo/models.ini b/systems/x86_64-linux/halo/models.ini
index 5b1bbac..bfc3fc9 100644
--- a/systems/x86_64-linux/halo/models.ini
+++ b/systems/x86_64-linux/halo/models.ini
@@ -18,6 +18,7 @@ c = 131072
 hf = unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q8_K_XL
 spec-type = ngram-simple,draft-mtp
 spec-draft-n-max = 5
+spec-draft-p-min = 0.74
 threads-batch = 16
 temp = 0.6
 top-p = 0.95