chore(halo): set spec-draft-p-min for coder model
Add a 0.74 confidence threshold so speculative drafting stops early once the draft model's predicted token probability drops below it, favoring shorter, higher-acceptance draft sequences.
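
The early-stop behaviour described above can be sketched as follows. This is a minimal illustration, not the runtime's actual implementation: `draft_tokens`, `make_scripted_step`, and the `draft_step` callback are hypothetical names introduced here, and the sketch assumes a greedy draft loop that discards the first token whose probability falls below the threshold. Whether the real drafter keeps or drops that token is not shown in this commit.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical draft-model interface (an assumption for illustration):
# given the tokens so far, return the most likely next token and its probability.
DraftStep = Callable[[List[int]], Tuple[int, float]]


def draft_tokens(
    context: Sequence[int],
    draft_step: DraftStep,
    n_max: int = 5,       # mirrors spec-draft-n-max = 5
    p_min: float = 0.74,  # mirrors spec-draft-p-min = 0.74
) -> List[int]:
    """Draft up to n_max tokens, stopping once confidence drops below p_min."""
    tokens = list(context)
    drafted: List[int] = []
    for _ in range(n_max):
        token, prob = draft_step(tokens)
        if prob < p_min:
            # Low-confidence proposal: stop early instead of spending
            # verification work on a token that is likely to be rejected.
            break
        drafted.append(token)
        tokens.append(token)
    return drafted


def make_scripted_step(probs: Sequence[float]) -> DraftStep:
    """Stand-in for a draft model: proposes token i with probability probs[i]."""
    scripted = iter(enumerate(probs))

    def step(_tokens: List[int]) -> Tuple[int, float]:
        return next(scripted)

    return step


if __name__ == "__main__":
    step = make_scripted_step([0.91, 0.80, 0.52, 0.97, 0.99])
    # The third proposal (0.52) is below 0.74, so only two tokens are drafted.
    print(draft_tokens([101, 102], step))
```

With a threshold of 0.74, a draft ends at the first uncertain token even if `n_max` has not been reached, which is the trade-off the commit message describes: shorter drafts that the target model is more likely to accept in full.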
parent 3a070413e4
commit ccd8750899
1 changed file with 1 addition and 0 deletions
@@ -18,6 +18,7 @@ c = 131072
 hf = unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q8_K_XL
 spec-type = ngram-simple,draft-mtp
 spec-draft-n-max = 5
+spec-draft-p-min = 0.74
 threads-batch = 16
 temp = 0.6
 top-p = 0.95