fix(channels): harden whatsapp web mode and document dual backend

Chummy 2026-02-20 16:31:27 +08:00
parent 70f12e5df9
commit d0674c4b98
7 changed files with 297 additions and 46 deletions


@ -524,7 +524,37 @@ For non-text replies, ZeroClaw can send Telegram attachments when the assistant
Paths can be local files (for example `/tmp/screenshot.png`) or HTTPS URLs.
### WhatsApp Business Cloud API Setup ### WhatsApp Setup
ZeroClaw supports two WhatsApp backends:
- **WhatsApp Web mode** (QR / pair code, no Meta Business API required)
- **WhatsApp Business Cloud API mode** (official Meta webhook flow)
#### WhatsApp Web mode (recommended for personal/self-hosted use)
1. **Build with WhatsApp Web support:**
```bash
cargo build --features whatsapp-web
```
2. **Configure ZeroClaw:**
```toml
[channels_config.whatsapp]
session_path = "~/.zeroclaw/state/whatsapp-web/session.db"
pair_phone = "15551234567" # optional; omit to use QR flow
pair_code = "" # optional custom pair code
allowed_numbers = ["+1234567890"] # E.164 format, or ["*"] for all
```
3. **Start channels/daemon and link device:**
- Run `zeroclaw channel start` (or `zeroclaw daemon`).
- Follow the pairing output in the terminal (QR code or pair code).
- In WhatsApp on your phone, open **Settings → Linked Devices** and link the new device.
4. **Test:** Send a message from an allowed number and verify the agent replies.
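The allowlist rule behind `allowed_numbers` can be sketched as below. This is a standalone illustration rather than the channel's actual code: the free functions `normalize_sender` and `is_number_allowed` are hypothetical stand-ins, and the sketch assumes entries are compared as exact E.164 strings, with `"*"` as the only wildcard.

```rust
/// Prefix a bare sender with `+` so it compares against E.164 entries.
/// (Assumption: senders arrive as digits only or already in E.164 form.)
fn normalize_sender(sender: &str) -> String {
    if sender.starts_with('+') {
        sender.to_string()
    } else {
        format!("+{sender}")
    }
}

/// An empty allowlist denies everyone; a `"*"` entry allows everyone;
/// otherwise the number must match an entry exactly.
fn is_number_allowed(allowed_numbers: &[String], phone: &str) -> bool {
    allowed_numbers.iter().any(|n| n == "*" || n == phone)
}
```

Under this rule, leaving `allowed_numbers` empty blocks every sender, so a fresh setup needs at least one number (or `"*"`) before the test message in step 4 gets a reply.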
#### WhatsApp Business Cloud API mode
WhatsApp uses Meta's Cloud API with webhooks (push-based, not polling):


@ -101,7 +101,7 @@ If `[channels_config.matrix]` is present but the binary was built without `chann
| Mattermost | polling | No |
| Matrix | sync API (supports E2EE) | No |
| Signal | signal-cli HTTP bridge | No (local bridge endpoint) |
| WhatsApp | webhook | Yes (public HTTPS callback) | | WhatsApp | webhook (Cloud API) or websocket (Web mode) | Cloud API: Yes (public HTTPS callback), Web mode: No |
| Webhook | gateway endpoint (`/webhook`) | Usually yes |
| Email | IMAP polling + SMTP send | No |
| IRC | IRC socket | No |
@ -208,6 +208,13 @@ ignore_stories = true
### 4.7 WhatsApp
ZeroClaw supports two WhatsApp backends:
- **Cloud API mode** (`phone_number_id` + `access_token` + `verify_token`)
- **WhatsApp Web mode** (`session_path`, requires build flag `--features whatsapp-web`)
Cloud API mode:
```toml ```toml
[channels_config.whatsapp] [channels_config.whatsapp]
access_token = "EAAB..." access_token = "EAAB..."
@ -217,6 +224,22 @@ app_secret = "your-app-secret" # optional but recommended
allowed_numbers = ["*"] allowed_numbers = ["*"]
``` ```
WhatsApp Web mode:
```toml
[channels_config.whatsapp]
session_path = "~/.zeroclaw/state/whatsapp-web/session.db"
pair_phone = "15551234567" # optional; omit to use QR flow
pair_code = "" # optional custom pair code
allowed_numbers = ["*"]
```
Notes:
- Build with `cargo build --features whatsapp-web` (or pass the same `--features whatsapp-web` flag to `cargo run`).
- Keep `session_path` on persistent storage to avoid relinking after restart.
- Reply routing uses the originating chat JID, so direct and group replies work correctly.
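As a rough sketch of the reply-routing note: a recipient that is already a JID (it contains `@`) skips the outbound allowlist check, while a bare phone number is normalized and checked. The `may_send` helper and its simplified `+` prefix normalization are assumptions for illustration only; the channel's own `normalize_phone` may behave differently.

```rust
/// True when the recipient looks like a WhatsApp JID (has a domain suffix).
fn is_jid(recipient: &str) -> bool {
    recipient.trim().contains('@')
}

/// Hypothetical helper: decide whether an outbound message may be sent.
fn may_send(allowed_numbers: &[String], recipient: &str) -> bool {
    if is_jid(recipient) {
        // Replies target the originating chat JID (DM or group),
        // so the phone-number allowlist is not consulted here.
        return true;
    }
    // Simplified normalization (assumption): ensure a leading '+'.
    let normalized = if recipient.starts_with('+') {
        recipient.to_string()
    } else {
        format!("+{recipient}")
    };
    allowed_numbers.iter().any(|n| n == "*" || *n == normalized)
}
```

With this shape, a group reply to a `...@g.us` chat goes through even when the allowlist holds only individual numbers, because inbound filtering has already decided whether the sender was allowed.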
### 4.8 Webhook Channel Config (Gateway)
`channels_config.webhook` enables webhook-specific gateway behavior.
@ -375,7 +398,7 @@ rg -n "Matrix|Telegram|Discord|Slack|Mattermost|Signal|WhatsApp|Email|IRC|Lark|D
| Mattermost | `Mattermost channel listening on` | `Mattermost: ignoring message from unauthorized user:` | `Mattermost poll error:` / `Mattermost parse error:` |
| Matrix | `Matrix channel listening on room` / `Matrix room ... is encrypted; E2EE decryption is enabled via matrix-sdk.` | `Matrix whoami failed; falling back to configured session hints for E2EE session restore:` / `Matrix whoami failed while resolving listener user_id; using configured user_id hint:` | `Matrix sync error: ... retrying...` |
| Signal | `Signal channel listening via SSE on` | (allowlist checks are enforced by `allowed_from`) | `Signal SSE returned ...` / `Signal SSE connect error:` |
| WhatsApp (channel) | `WhatsApp channel active (webhook mode).` | `WhatsApp: ignoring message from unauthorized number:` | `WhatsApp send failed:` | | WhatsApp (channel) | `WhatsApp channel active (webhook mode).` / `WhatsApp Web connected successfully` | `WhatsApp: ignoring message from unauthorized number:` / `WhatsApp Web: message from ... not in allowed list` | `WhatsApp send failed:` / `WhatsApp Web stream error:` |
| Webhook / WhatsApp (gateway) | `WhatsApp webhook verified successfully` | `Webhook: rejected — not paired / invalid bearer token` / `Webhook: rejected request — invalid or missing X-Webhook-Secret` / `WhatsApp webhook verification failed — token mismatch` | `Webhook JSON parse error:` |
| Email | `Email polling every ...` / `Email sent to ...` | `Blocked email from ...` | `Email poll failed:` / `Email poll task panicked:` |
| IRC | `IRC channel connecting to ...` / `IRC registered as ...` | (allowlist checks are enforced by `allowed_users`) | `IRC SASL authentication failed (...)` / `IRC server does not support SASL...` / `IRC nickname ... is in use, trying ...` |


@ -378,6 +378,34 @@ Notes:
See detailed channel matrix and allowlist behavior in [channels-reference.md](channels-reference.md).
### `[channels_config.whatsapp]`
WhatsApp supports two backends under one config table.
Cloud API mode (Meta webhook):
| Key | Required | Purpose |
|---|---|---|
| `access_token` | Yes | Meta Cloud API bearer token |
| `phone_number_id` | Yes | Meta phone number ID |
| `verify_token` | Yes | Webhook verification token |
| `app_secret` | Optional | Enables webhook signature verification (`X-Hub-Signature-256`) |
| `allowed_numbers` | Recommended | Allowed inbound numbers (`[]` = deny all, `"*"` = allow all) |
WhatsApp Web mode (native client):
| Key | Required | Purpose |
|---|---|---|
| `session_path` | Yes | Persistent SQLite session path |
| `pair_phone` | Optional | Phone number for the pair-code flow (digits only); omit to use the QR flow |
| `pair_code` | Optional | Custom pair code (otherwise auto-generated) |
| `allowed_numbers` | Recommended | Allowed inbound numbers (`[]` = deny all, `"*"` = allow all) |
Notes:
- WhatsApp Web requires build flag `whatsapp-web`.
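- If both `phone_number_id` (Cloud API) and `session_path` (Web mode) are set, Cloud API mode takes precedence for backward compatibility and a warning is logged. Remove one of the two to avoid ambiguity.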
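The precedence rule can be pictured with a small sketch. `WhatsAppSelectors` is a hypothetical, trimmed-down stand-in for the real config struct, and returning `None` when neither selector is set is an assumption made here; the actual `backend_type()` may handle that case differently.

```rust
/// Hypothetical subset of the WhatsApp config: only the two mode selectors.
#[derive(Debug, Default)]
struct WhatsAppSelectors {
    phone_number_id: Option<String>, // selects Cloud API mode
    session_path: Option<String>,    // selects WhatsApp Web mode
}

impl WhatsAppSelectors {
    /// Both selectors present: the config is ambiguous.
    fn is_ambiguous(&self) -> bool {
        self.phone_number_id.is_some() && self.session_path.is_some()
    }

    /// Cloud API wins whenever `phone_number_id` is set.
    fn backend(&self) -> Option<&'static str> {
        match (&self.phone_number_id, &self.session_path) {
            (Some(_), _) => Some("cloud"),
            (None, Some(_)) => Some("web"),
            (None, None) => None, // assumption: no backend selected
        }
    }
}
```

A caller would typically check `is_ambiguous()` first to emit the warning, then dispatch on `backend()`.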
## `[hardware]`
Hardware wizard configuration for physical-world access (STM32, probe, serial).


@ -2073,6 +2073,11 @@ pub async fn doctor_channels(config: Config) -> Result<()> {
} }
if let Some(ref wa) = config.channels_config.whatsapp { if let Some(ref wa) = config.channels_config.whatsapp {
if wa.is_ambiguous_config() {
tracing::warn!(
"WhatsApp config has both phone_number_id and session_path set; preferring Cloud API mode. Remove one selector to avoid ambiguity."
);
}
// Runtime negotiation: detect backend type from config // Runtime negotiation: detect backend type from config
match wa.backend_type() { match wa.backend_type() {
"cloud" => { "cloud" => {
@ -2462,6 +2467,11 @@ pub async fn start_channels(config: Config) -> Result<()> {
} }
if let Some(ref wa) = config.channels_config.whatsapp { if let Some(ref wa) = config.channels_config.whatsapp {
if wa.is_ambiguous_config() {
tracing::warn!(
"WhatsApp config has both phone_number_id and session_path set; preferring Cloud API mode. Remove one selector to avoid ambiguity."
);
}
// Runtime negotiation: detect backend type from config // Runtime negotiation: detect backend type from config
match wa.backend_type() { match wa.backend_type() {
"cloud" => { "cloud" => {


@ -15,7 +15,7 @@
//! # Configuration //! # Configuration
//! //!
//! ```toml //! ```toml
//! [channels.whatsapp] //! [channels_config.whatsapp]
//! session_path = "~/.zeroclaw/whatsapp-session.db" # Required for Web mode //! session_path = "~/.zeroclaw/whatsapp-session.db" # Required for Web mode
//! pair_phone = "15551234567" # Optional: for pair code linking //! pair_phone = "15551234567" # Optional: for pair code linking
//! allowed_numbers = ["+1234567890", "*"] # Same as Cloud API //! allowed_numbers = ["+1234567890", "*"] # Same as Cloud API
@ -43,7 +43,7 @@ use tokio::select;
/// # Configuration /// # Configuration
/// ///
/// ```toml /// ```toml
/// [channels.whatsapp] /// [channels_config.whatsapp]
/// session_path = "~/.zeroclaw/whatsapp-session.db" /// session_path = "~/.zeroclaw/whatsapp-session.db"
/// pair_phone = "15551234567" # Optional /// pair_phone = "15551234567" # Optional
/// allowed_numbers = ["+1234567890", "*"] /// allowed_numbers = ["+1234567890", "*"]
@ -96,8 +96,7 @@ impl WhatsAppWebChannel {
/// Check if a phone number is allowed (E.164 format: +1234567890) /// Check if a phone number is allowed (E.164 format: +1234567890)
#[cfg(feature = "whatsapp-web")] #[cfg(feature = "whatsapp-web")]
fn is_number_allowed(&self, phone: &str) -> bool { fn is_number_allowed(&self, phone: &str) -> bool {
self.allowed_numbers.is_empty() self.allowed_numbers.iter().any(|n| n == "*" || n == phone)
|| self.allowed_numbers.iter().any(|n| n == "*" || n == phone)
} }
/// Normalize phone number to E.164 format /// Normalize phone number to E.164 format
@ -116,6 +115,12 @@ impl WhatsAppWebChannel {
} }
} }
/// Whether the recipient string is a WhatsApp JID (contains a domain suffix).
#[cfg(feature = "whatsapp-web")]
fn is_jid(recipient: &str) -> bool {
recipient.trim().contains('@')
}
/// Convert a recipient to a wa-rs JID. /// Convert a recipient to a wa-rs JID.
/// ///
/// Supports: /// Supports:
@ -156,14 +161,16 @@ impl Channel for WhatsAppWebChannel {
anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first."); anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first.");
}; };
// Validate recipient is allowed // Validate recipient allowlist only for direct phone-number targets.
let normalized = self.normalize_phone(&message.recipient); if !Self::is_jid(&message.recipient) {
if !self.is_number_allowed(&normalized) { let normalized = self.normalize_phone(&message.recipient);
tracing::warn!( if !self.is_number_allowed(&normalized) {
"WhatsApp Web: recipient {} not in allowed list", tracing::warn!(
message.recipient "WhatsApp Web: recipient {} not in allowed list",
); message.recipient
return Ok(()); );
return Ok(());
}
} }
let to = self.recipient_to_jid(&message.recipient)?; let to = self.recipient_to_jid(&message.recipient)?;
@ -246,7 +253,12 @@ impl Channel for WhatsAppWebChannel {
let sender = info.source.sender.user().to_string(); let sender = info.source.sender.user().to_string();
let chat = info.source.chat.to_string(); let chat = info.source.chat.to_string();
tracing::info!("📨 WhatsApp message from {} in {}: {}", sender, chat, text); tracing::info!(
"WhatsApp Web message from {} in {}: {}",
sender,
chat,
text
);
// Check if sender is allowed // Check if sender is allowed
let normalized = if sender.starts_with('+') { let normalized = if sender.starts_with('+') {
@ -255,17 +267,26 @@ impl Channel for WhatsAppWebChannel {
format!("+{sender}") format!("+{sender}")
}; };
if allowed_numbers.is_empty() if allowed_numbers.iter().any(|n| n == "*" || n == &normalized) {
|| allowed_numbers.iter().any(|n| n == "*" || n == &normalized) let trimmed = text.trim();
{ if trimmed.is_empty() {
tracing::debug!(
"WhatsApp Web: ignoring empty or non-text message from {}",
normalized
);
return;
}
if let Err(e) = tx_inner if let Err(e) = tx_inner
.send(ChannelMessage { .send(ChannelMessage {
id: uuid::Uuid::new_v4().to_string(), id: uuid::Uuid::new_v4().to_string(),
channel: "whatsapp".to_string(), channel: "whatsapp".to_string(),
sender: normalized.clone(), sender: normalized.clone(),
reply_target: normalized.clone(), // Reply to the originating chat JID (DM or group).
content: text.to_string(), reply_target: chat,
timestamp: chrono::Utc::now().timestamp_millis() as u64, content: trimmed.to_string(),
timestamp: chrono::Utc::now().timestamp() as u64,
thread_ts: None,
}) })
.await .await
{ {
@ -276,20 +297,24 @@ impl Channel for WhatsAppWebChannel {
} }
} }
Event::Connected(_) => { Event::Connected(_) => {
tracing::info!("WhatsApp Web connected successfully!"); tracing::info!("WhatsApp Web connected successfully");
} }
Event::LoggedOut(_) => { Event::LoggedOut(_) => {
tracing::warn!("WhatsApp Web was logged out!"); tracing::warn!("WhatsApp Web was logged out");
} }
Event::StreamError(stream_error) => { Event::StreamError(stream_error) => {
tracing::error!("WhatsApp Web stream error: {:?}", stream_error); tracing::error!("WhatsApp Web stream error: {:?}", stream_error);
} }
Event::PairingCode { code, .. } => { Event::PairingCode { code, .. } => {
tracing::info!("🔑 Pair code received: {}", code); tracing::info!("WhatsApp Web pair code received: {}", code);
tracing::info!("Link your phone by entering this code in WhatsApp > Linked Devices"); tracing::info!(
"Link your phone by entering this code in WhatsApp > Linked Devices"
);
} }
Event::PairingQrCode { code, .. } => { Event::PairingQrCode { code, .. } => {
tracing::info!("📱 QR code received (scan with WhatsApp > Linked Devices)"); tracing::info!(
"WhatsApp Web QR code received (scan with WhatsApp > Linked Devices)"
);
tracing::debug!("QR code: {}", code); tracing::debug!("QR code: {}", code);
} }
_ => {} _ => {}
@ -352,13 +377,15 @@ impl Channel for WhatsAppWebChannel {
anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first."); anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first.");
}; };
let normalized = self.normalize_phone(recipient); if !Self::is_jid(recipient) {
if !self.is_number_allowed(&normalized) { let normalized = self.normalize_phone(recipient);
tracing::warn!( if !self.is_number_allowed(&normalized) {
"WhatsApp Web: typing target {} not in allowed list", tracing::warn!(
recipient "WhatsApp Web: typing target {} not in allowed list",
); recipient
return Ok(()); );
return Ok(());
}
} }
let to = self.recipient_to_jid(recipient)?; let to = self.recipient_to_jid(recipient)?;
@ -378,13 +405,15 @@ impl Channel for WhatsAppWebChannel {
anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first."); anyhow::bail!("WhatsApp Web client not connected. Initialize the bot first.");
}; };
let normalized = self.normalize_phone(recipient); if !Self::is_jid(recipient) {
if !self.is_number_allowed(&normalized) { let normalized = self.normalize_phone(recipient);
tracing::warn!( if !self.is_number_allowed(&normalized) {
"WhatsApp Web: typing target {} not in allowed list", tracing::warn!(
recipient "WhatsApp Web: typing target {} not in allowed list",
); recipient
return Ok(()); );
return Ok(());
}
} }
let to = self.recipient_to_jid(recipient)?; let to = self.recipient_to_jid(recipient)?;
@ -498,8 +527,8 @@ mod tests {
#[cfg(feature = "whatsapp-web")] #[cfg(feature = "whatsapp-web")]
fn whatsapp_web_number_denied_empty() { fn whatsapp_web_number_denied_empty() {
let ch = WhatsAppWebChannel::new("/tmp/test.db".into(), None, None, vec![]); let ch = WhatsAppWebChannel::new("/tmp/test.db".into(), None, None, vec![]);
// Empty allowed_numbers means "allow all" (same behavior as Cloud API) // Empty allowlist means "deny all" (matches channel-wide allowlist policy).
assert!(ch.is_number_allowed("+1234567890")); assert!(!ch.is_number_allowed("+1234567890"));
} }
#[test] #[test]
@ -516,6 +545,16 @@ mod tests {
assert_eq!(ch.normalize_phone("+1234567890"), "+1234567890"); assert_eq!(ch.normalize_phone("+1234567890"), "+1234567890");
} }
#[test]
#[cfg(feature = "whatsapp-web")]
fn whatsapp_web_normalize_phone_from_jid() {
let ch = make_channel();
assert_eq!(
ch.normalize_phone("1234567890@s.whatsapp.net"),
"+1234567890"
);
}
#[tokio::test] #[tokio::test]
#[cfg(feature = "whatsapp-web")] #[cfg(feature = "whatsapp-web")]
async fn whatsapp_web_health_check_disconnected() { async fn whatsapp_web_health_check_disconnected() {


@ -2461,6 +2461,13 @@ impl WhatsAppConfig {
pub fn is_web_config(&self) -> bool { pub fn is_web_config(&self) -> bool {
self.session_path.is_some() self.session_path.is_some()
} }
/// Returns true when both Cloud and Web selectors are present.
///
/// Runtime currently prefers Cloud mode in this case for backward compatibility.
pub fn is_ambiguous_config(&self) -> bool {
self.phone_number_id.is_some() && self.session_path.is_some()
}
} }
/// IRC channel configuration. /// IRC channel configuration.
@ -4458,6 +4465,38 @@ channel_id = "C123"
assert_eq!(parsed.allowed_numbers, vec!["*"]); assert_eq!(parsed.allowed_numbers, vec!["*"]);
} }
#[test]
async fn whatsapp_config_backend_type_cloud_precedence_when_ambiguous() {
let wc = WhatsAppConfig {
access_token: Some("tok".into()),
phone_number_id: Some("123".into()),
verify_token: Some("ver".into()),
app_secret: None,
session_path: Some("~/.zeroclaw/state/whatsapp-web/session.db".into()),
pair_phone: None,
pair_code: None,
allowed_numbers: vec!["+1".into()],
};
assert!(wc.is_ambiguous_config());
assert_eq!(wc.backend_type(), "cloud");
}
#[test]
async fn whatsapp_config_backend_type_web() {
let wc = WhatsAppConfig {
access_token: None,
phone_number_id: None,
verify_token: None,
app_secret: None,
session_path: Some("~/.zeroclaw/state/whatsapp-web/session.db".into()),
pair_phone: None,
pair_code: None,
allowed_numbers: vec![],
};
assert!(!wc.is_ambiguous_config());
assert_eq!(wc.backend_type(), "web");
}
#[test] #[test]
async fn channels_config_with_whatsapp() { async fn channels_config_with_whatsapp() {
let c = ChannelsConfig { let c = ChannelsConfig {


@ -3238,10 +3238,92 @@ fn setup_channels() -> Result<ChannelsConfig> {
ChannelMenuChoice::WhatsApp => { ChannelMenuChoice::WhatsApp => {
// ── WhatsApp ── // ── WhatsApp ──
println!(); println!();
println!(" {}", style("WhatsApp Setup").white().bold());
let mode_options = vec![
"WhatsApp Web (QR / pair-code, no Meta Business API)",
"WhatsApp Business Cloud API (webhook)",
];
let mode_idx = Select::new()
.with_prompt(" Choose WhatsApp mode")
.items(&mode_options)
.default(0)
.interact()?;
if mode_idx == 0 {
println!(" {}", style("Mode: WhatsApp Web").dim());
print_bullet("1. Build with --features whatsapp-web");
print_bullet(
"2. Start channel/daemon and scan QR in WhatsApp > Linked Devices",
);
print_bullet("3. Keep session_path persistent so relogin is not required");
println!();
let session_path: String = Input::new()
.with_prompt(" Session database path")
.default("~/.zeroclaw/state/whatsapp-web/session.db".into())
.interact_text()?;
if session_path.trim().is_empty() {
println!(" {} Skipped — session path required", style("").dim());
continue;
}
let pair_phone: String = Input::new()
.with_prompt(
" Pair phone (optional, digits only; leave empty to use QR flow)",
)
.allow_empty(true)
.interact_text()?;
let pair_code: String = if pair_phone.trim().is_empty() {
String::new()
} else {
Input::new()
.with_prompt(
" Custom pair code (optional, leave empty for auto-generated)",
)
.allow_empty(true)
.interact_text()?
};
let users_str: String = Input::new()
.with_prompt(
" Allowed phone numbers (comma-separated +1234567890, or * for all)",
)
.default("*".into())
.interact_text()?;
let allowed_numbers = if users_str.trim() == "*" {
vec!["*".into()]
} else {
// Drop empty entries left by stray or trailing commas.
users_str.split(',').map(|s| s.trim().to_string()).filter(|s| !s.is_empty()).collect()
};
config.whatsapp = Some(WhatsAppConfig {
access_token: None,
phone_number_id: None,
verify_token: None,
app_secret: None,
session_path: Some(session_path.trim().to_string()),
pair_phone: (!pair_phone.trim().is_empty())
.then(|| pair_phone.trim().to_string()),
pair_code: (!pair_code.trim().is_empty())
.then(|| pair_code.trim().to_string()),
allowed_numbers,
});
println!(
" {} WhatsApp Web configuration saved.",
style("").green().bold()
);
continue;
}
println!( println!(
" {} {}", " {} {}",
style("WhatsApp Setup").white().bold(), style("Mode:").dim(),
style("— Business Cloud API").dim() style("Business Cloud API").dim()
); );
print_bullet("1. Go to developers.facebook.com and create a WhatsApp app"); print_bullet("1. Go to developers.facebook.com and create a WhatsApp app");
print_bullet("2. Add the WhatsApp product and get your phone number ID"); print_bullet("2. Add the WhatsApp product and get your phone number ID");