feat(channel): stream LLM responses to Telegram via draft message edits

Wire the existing provider-layer streaming infrastructure through the
channel trait and agent loop so Telegram users see tokens arrive
progressively via editMessageText, instead of waiting for the full
response.

Changes:
- Add StreamMode enum (off/partial/block) and draft_update_interval_ms
  to TelegramConfig (backward-compatible defaults: off, 1000ms)
- Add supports_draft_updates/send_draft/update_draft/finalize_draft to
  Channel trait with no-op defaults (zero impact on existing channels)
- Implement draft methods on TelegramChannel using sendMessage +
  editMessageText with rate limiting and Markdown fallback
- Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop
  (None preserves existing behavior)
- Wire streaming in process_channel_message: when channel supports
  drafts, send initial draft, spawn updater task, finalize on completion

Edge cases handled:
- 4096-char limit: finalize draft and fall back to chunked send
- Broken Markdown: send without parse_mode during streaming, apply it on finalize
- Edit failures: fall back to sending complete response as new message
- Rate limiting: configurable draft_update_interval_ms (default 1000ms)
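
Illustration (not part of the change itself): a rough, std-only sketch of two
of the edge cases above, splitting text at the 4096-character limit and
throttling edits to one per interval. The names split_into_chunks,
DraftThrottle, and try_acquire are hypothetical, and counting Rust chars
(Unicode scalar values) against the limit is an assumption of this sketch.

```rust
use std::time::{Duration, Instant};

/// Message length limit named in the commit message.
pub const TELEGRAM_MAX_CHARS: usize = 4096;

/// Split `text` into pieces of at most `limit` chars, never cutting a char.
pub fn split_into_chunks(text: &str, limit: usize) -> Vec<String> {
    assert!(limit > 0, "limit must be positive");
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut count = 0;
    for ch in text.chars() {
        if count == limit {
            // Current piece is full: store it and start a new one.
            chunks.push(std::mem::take(&mut current));
            count = 0;
        }
        current.push(ch);
        count += 1;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Allows at most one draft edit per interval.
pub struct DraftThrottle {
    interval: Duration,
    last_edit: Option<Instant>,
}

impl DraftThrottle {
    pub fn new(interval_ms: u64) -> Self {
        Self {
            interval: Duration::from_millis(interval_ms),
            last_edit: None,
        }
    }

    /// Returns true and records `now` if an edit may be sent at `now`.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        match self.last_edit {
            Some(prev) if now.duration_since(prev) < self.interval => false,
            _ => {
                self.last_edit = Some(now);
                true
            }
        }
    }
}
```

Passing the timestamp in, rather than reading the clock inside try_acquire,
keeps the throttle deterministic to test.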
Xiangjun Ma 2026-02-17 23:46:32 -08:00 committed by Chummy
parent a0b277b21e
commit 118cd53922
12 changed files with 410 additions and 43 deletions

@@ -790,6 +790,7 @@ pub(crate) async fn agent_turn(
None,
"channel",
max_tool_iterations,
None,
)
.await
}
@@ -809,6 +810,7 @@ pub(crate) async fn run_tool_call_loop(
approval: Option<&ApprovalManager>,
channel_name: &str,
max_tool_iterations: usize,
on_delta: Option<tokio::sync::mpsc::Sender<String>>,
) -> Result<String> {
let max_iterations = if max_tool_iterations == 0 {
DEFAULT_MAX_TOOL_ITERATIONS
@@ -938,7 +940,11 @@ pub(crate) async fn run_tool_call_loop(
};
if tool_calls.is_empty() {
// No tool calls — this is the final response.
// If a streaming sender is provided, send the final text through it.
if let Some(ref tx) = on_delta {
let _ = tx.send(display_text.clone()).await;
}
history.push(ChatMessage::assistant(response_text.clone()));
return Ok(display_text);
}
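
Note on the hunk above: as it reads, the sender receives the complete
display_text once, when the loop reaches its final response. A minimal sketch
of that hand-off follows. It swaps tokio::sync::mpsc for std::sync::mpsc and a
thread so it runs without dependencies; run_loop and run_with_updater are
hypothetical stand-ins, not the real run_tool_call_loop or
process_channel_message.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread;

/// Hypothetical stand-in for the agent loop: pushes the final text through
/// `on_delta` when a sender is supplied, and always returns it.
pub fn run_loop(final_text: &str, on_delta: Option<Sender<String>>) -> String {
    if let Some(ref tx) = on_delta {
        // Ignore send errors: a dropped receiver should not fail the turn.
        let _ = tx.send(final_text.to_string());
    }
    final_text.to_string()
}

/// Runs the loop with a consumer thread standing in for the draft updater.
/// Returns the loop's result and every message the updater observed.
pub fn run_with_updater(final_text: &str) -> (String, Vec<String>) {
    let (tx, rx) = mpsc::channel::<String>();
    // The updater drains the channel until every sender has been dropped.
    let updater = thread::spawn(move || rx.into_iter().collect::<Vec<String>>());
    // `tx` is moved into the loop and dropped when it returns, closing the channel.
    let result = run_loop(final_text, Some(tx));
    let seen = updater.join().expect("updater thread panicked");
    (result, seen)
}
```

Passing None keeps the non-streaming call sites (such as the "cli" ones in
this diff) on their existing path, since the send is skipped entirely.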
@@ -1358,6 +1364,7 @@ pub async fn run(
Some(&approval_manager),
"cli",
config.agent.max_tool_iterations,
None,
)
.await?;
final_output = response.clone();
@@ -1483,6 +1490,7 @@ pub async fn run(
Some(&approval_manager),
"cli",
config.agent.max_tool_iterations,
None,
)
.await
{