ZeroClaw Operations Runbook
This runbook is for operators who maintain availability, security posture, and incident response.
Last verified: February 18, 2026.
Scope
Use this document for day-2 operations:
- starting and supervising runtime
- health checks and diagnostics
- safe rollout and rollback
- incident triage and recovery
For first-time installation, start from one-click-bootstrap.md.
Runtime Modes
| Mode | Command | When to use |
|---|---|---|
| Foreground runtime | zeroclaw daemon | local debugging, short-lived sessions |
| Foreground gateway only | zeroclaw gateway | webhook endpoint testing |
| User service | zeroclaw service install && zeroclaw service start | persistent operator-managed runtime |
Baseline Operator Checklist
- Validate configuration:
zeroclaw status
- Verify diagnostics:
zeroclaw doctor
zeroclaw channel doctor
- Start runtime:
zeroclaw daemon
- For persistent user session service:
zeroclaw service install
zeroclaw service start
zeroclaw service status
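The validation and diagnostic steps above can be chained so that a failed check blocks the start. This is a minimal sketch: `preflight` is a hypothetical helper name, and it assumes each `zeroclaw` command exits with a non-zero status on failure, which this runbook does not state.

```shell
#!/bin/sh
# Hypothetical helper: run the checklist commands in order and stop at the first failure.
# ASSUMPTION: zeroclaw exits non-zero when a check fails; confirm this for your build.
preflight() {
    for check in "status" "doctor" "channel doctor"; do
        # $check is intentionally unquoted so "channel doctor" splits into two arguments.
        if ! zeroclaw $check; then
            echo "preflight failed at: zeroclaw $check" >&2
            return 1
        fi
    done
    echo "preflight passed"
}

# Example: preflight && zeroclaw daemon
```

If the exit-status assumption does not hold, the command output still has to be read manually.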
Health and State Signals
| Signal | Command / File | Expected |
|---|---|---|
| Config validity | zeroclaw doctor | no critical errors |
| Channel connectivity | zeroclaw channel doctor | configured channels healthy |
| Runtime summary | zeroclaw status | expected provider/model/channels |
| Daemon heartbeat/state | ~/.zeroclaw/daemon_state.json | file updates periodically |
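The heartbeat signal can be checked by looking at the state file's modification time. In the sketch below, `state_is_fresh` is a hypothetical helper and the 5-minute default is an assumed threshold, since the actual update interval is not documented here. It relies on `find -mmin`, which GNU and BSD `find` provide but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE was modified less than MAX_AGE_MIN minutes ago.
# ASSUMPTION: 5 minutes is a placeholder threshold, not a documented ZeroClaw interval.
# Returns 0 if fresh, 1 if stale, 2 if the file does not exist.
state_is_fresh() {
    file="$1"
    max_age_min="${2:-5}"
    [ -f "$file" ] || return 2
    # find prints the path only when the modification time is recent enough.
    [ -n "$(find "$file" -mmin "-$max_age_min" 2>/dev/null)" ]
}

# Example: state_is_fresh "$HOME/.zeroclaw/daemon_state.json" 5 || echo "heartbeat stale or missing"
```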
Logs and Diagnostics
macOS / Windows (service wrapper logs)
- ~/.zeroclaw/logs/daemon.stdout.log
- ~/.zeroclaw/logs/daemon.stderr.log
Linux (systemd user service)
journalctl --user -u zeroclaw.service -f
Incident Triage Flow (Fast Path)
- Snapshot system state:
zeroclaw status
zeroclaw doctor
zeroclaw channel doctor
- Check service state:
zeroclaw service status
- If service is unhealthy, restart cleanly:
zeroclaw service stop
zeroclaw service start
- If channels still fail, verify allowlists and credentials in ~/.zeroclaw/config.toml.
- If gateway is involved, verify bind/auth settings ([gateway]) and local reachability.
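Capturing the snapshot to files before restarting anything preserves evidence for the later write-up. The sketch below saves the output of each triage command to its own file. `snapshot_state` and the output directory naming are hypothetical, and only the commands already listed in this runbook are called.

```shell
#!/bin/sh
# Hypothetical helper: write the output of each triage command to OUT_DIR/<name>.txt.
# The default directory name is an illustrative choice, not a ZeroClaw convention.
snapshot_state() {
    out_dir="${1:-zeroclaw-snapshot-$(date +%Y%m%dT%H%M%S)}"
    mkdir -p "$out_dir" || return 1
    for check in "status" "doctor" "channel doctor" "service status"; do
        name=$(printf '%s' "$check" | tr ' ' '-')
        rc=0
        # $check is intentionally unquoted so multi-word subcommands split into arguments.
        # A failing command does not abort the loop; the failure is part of the record.
        zeroclaw $check > "$out_dir/$name.txt" 2>&1 || rc=$?
        printf 'exit code: %s\n' "$rc" >> "$out_dir/$name.txt"
    done
    printf '%s\n' "$out_dir"
}

# Example: snapshot_state /tmp/incident-snapshot
```

The function prints the directory it wrote to, so the path can be pasted into an incident ticket.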
Safe Change Procedure
Before applying config changes:
- back up ~/.zeroclaw/config.toml
- apply one logical change at a time
- run zeroclaw doctor
- restart daemon/service
- verify with zeroclaw status and zeroclaw channel doctor
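The backup step can be made repeatable with a small function that writes a timestamped copy next to the original. This is a sketch under assumptions: `backup_config` and the `.bak.<timestamp>` suffix are illustrative choices, not ZeroClaw features.

```shell
#!/bin/sh
# Hypothetical helper: copy the config to "<config>.bak.<timestamp>" and print the backup path.
# The suffix format is an illustrative convention chosen so backups sort by age.
backup_config() {
    cfg="${1:-$HOME/.zeroclaw/config.toml}"
    if [ ! -f "$cfg" ]; then
        echo "config not found: $cfg" >&2
        return 1
    fi
    backup="$cfg.bak.$(date +%Y%m%dT%H%M%S)"
    # -p preserves mode and timestamps of the original file.
    cp -p "$cfg" "$backup" && printf '%s\n' "$backup"
}

# Example: backup_config    # then edit the config and run: zeroclaw doctor
```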
Rollback Procedure
If a rollout regresses behavior:
- restore previous config.toml
- restart runtime (daemon or service)
- confirm recovery via doctor and channel health checks
- document incident root cause and mitigation
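The restore step can be sketched as a function that copies the newest backup over the live config. It assumes backups sit next to the config and are named `<config>.bak.<timestamp>` with a sortable timestamp. Both the naming scheme and the `restore_latest_backup` name are hypothetical, so adapt them to however backups are actually kept.

```shell
#!/bin/sh
# Hypothetical helper: restore the newest "<config>.bak.<timestamp>" file over the config.
# ASSUMPTION: backup names end in a timestamp that sorts lexicographically by age.
restore_latest_backup() {
    cfg="${1:-$HOME/.zeroclaw/config.toml}"
    latest=""
    # Glob results are sorted, so the last existing match is the newest backup.
    for candidate in "$cfg".bak.*; do
        if [ -f "$candidate" ]; then
            latest="$candidate"
        fi
    done
    if [ -z "$latest" ]; then
        echo "no backup found for $cfg" >&2
        return 1
    fi
    cp -p "$latest" "$cfg" && printf 'restored %s\n' "$latest"
}

# Example: restore_latest_backup && zeroclaw service stop && zeroclaw service start
```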