wyoming-whisper-rs/CLAUDE.md
Harald Hoyer 50fdb08a38 Add Wyoming protocol ASR server and nix devshell
New wyoming-whisper-rs binary crate implementing the Wyoming protocol
over TCP, making whisper-rs usable with Home Assistant's voice pipeline.
Includes nix flake devshell with Vulkan, ROCm/hipBLAS, clippy, and
rustfmt support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 11:44:03 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

whisper-rs provides safe Rust bindings to whisper.cpp, a C++ speech recognition library. It's a two-crate workspace:

  • whisper-rs (root) — Safe public API
  • whisper-rs-sys (sys/) — FFI bindings generated via bindgen, with a CMake-based build of the whisper.cpp submodule

The upstream C++ source lives in sys/whisper.cpp/ (git submodule — clone with --recursive).

Build Commands

cargo build                                    # Default build
cargo build --release --features vulkan        # With Vulkan GPU support
cargo build --release --features hipblas       # With AMD ROCm support
cargo build --release --features cuda          # With NVIDIA CUDA support
cargo test                                     # Run all tests
cargo fmt                                      # Format code
cargo clippy                                   # Lint

Running examples (require a GGML model file and a WAV audio file):

cargo run --example basic_use -- model.bin audio.wav
cargo run --example audio_transcription -- model.bin audio.wav
cargo run --example vad -- model.bin audio.wav output.wav

Skipping bindgen (use pre-generated bindings): set WHISPER_DONT_GENERATE_BINDINGS=1.

All WHISPER_* and CMAKE_* env vars are forwarded to the CMake build.
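The forwarding rule can be sketched as a prefix filter over the process environment. This is a minimal illustration that uses only the standard library; the helper names `is_forwarded` and `forwarded_vars` are hypothetical, and the real sys/build.rs may exclude some names (for example its own control variables) or pass values to CMake differently.

```rust
/// Returns true for variable names that match the rule stated above
/// (WHISPER_* and CMAKE_* prefixes). Assumption: no names are excluded.
fn is_forwarded(name: &str) -> bool {
    name.starts_with("WHISPER_") || name.starts_with("CMAKE_")
}

/// Hypothetical helper: keeps only the (name, value) pairs that would be
/// forwarded. Accepts any iterator, so it works with `std::env::vars()`
/// as well as with hand-written test data.
fn forwarded_vars<I>(vars: I) -> Vec<(String, String)>
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut out: Vec<(String, String)> = vars
        .into_iter()
        .filter(|(name, _)| is_forwarded(name))
        .collect();
    out.sort(); // deterministic order for logging and comparison
    out
}

fn main() {
    // In a build script, each pair would become a CMake define;
    // here they are only printed.
    for (name, value) in forwarded_vars(std::env::vars()) {
        println!("forwarding {name}={value}");
    }
}
```

Running it as `CMAKE_BUILD_TYPE=Release cargo run` should list that variable among the forwarded ones.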

Architecture

whisper-rs (safe Rust API)
  → whisper-rs-sys (bindgen FFI)
    → whisper.cpp (C++ submodule, built via CMake)
      → GGML (tensor library with CPU/GPU backends)

Key types and their relationships:

  • WhisperContext — Arc-wrapped model handle, thread-safe, created from a model file
  • WhisperState — Inference state created from a context; multiple states can share one context
  • FullParams — Transcription configuration (sampling strategy, language, callbacks)
  • WhisperSegment / WhisperToken — Result types with timestamps and probabilities

Core flow: WhisperContext::new_with_params() → ctx.create_state() → state.full(params, &audio_data) → iterate segments via state.as_iter()
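The ownership relationship between a context and its states can be modelled with the standard library alone. The types below (`Model`, `Context`, `State`) are hypothetical stand-ins, not the whisper-rs types, and the "inference" is a placeholder; the sketch only shows how an Arc-wrapped handle lets several states on different threads share one loaded model.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for loaded model weights (hypothetical, not a whisper-rs type).
struct Model {
    name: String,
}

/// Stand-in for a context: a handle around shared, immutable model data.
struct Context {
    model: Arc<Model>,
}

/// Stand-in for a per-inference state; its own Arc keeps the model alive.
struct State {
    model: Arc<Model>,
    segments: Vec<String>,
}

impl Context {
    fn new(name: &str) -> Self {
        Context { model: Arc::new(Model { name: name.to_string() }) }
    }

    /// Each call yields an independent state that shares the same model.
    fn create_state(&self) -> State {
        State { model: Arc::clone(&self.model), segments: Vec::new() }
    }
}

impl State {
    /// Placeholder "inference": records one segment describing the input.
    fn full(&mut self, audio: &[f32]) {
        self.segments
            .push(format!("{}: {} samples", self.model.name, audio.len()));
    }
}

fn main() {
    let ctx = Context::new("demo-model");
    // Two states on two threads, one shared model.
    let handles: Vec<_> = (1..=2)
        .map(|n| {
            let mut state = ctx.create_state();
            thread::spawn(move || {
                state.full(&vec![0.0_f32; n * 16_000]);
                state.segments
            })
        })
        .collect();
    for handle in handles {
        for segment in handle.join().unwrap() {
            println!("{segment}");
        }
    }
}
```

Because every state holds its own reference, the model in this sketch is freed only after the context and all states have been dropped.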

Module layout in src/:

  • whisper_ctx.rs — Raw context wrapper (WhisperInnerContext)
  • whisper_ctx_wrapper.rs — Safe public WhisperContext
  • whisper_state/ — State management, segments, tokens, iterators
  • whisper_params.rs — FullParams, SamplingStrategy (Greedy / BeamSearch)
  • whisper_vad.rs — Voice Activity Detection
  • whisper_grammar.rs — GBNF grammar-constrained decoding
  • vulkan.rs — Vulkan device enumeration (behind vulkan feature)

Build System (sys/build.rs)

The build script:

  1. Copies whisper.cpp sources into OUT_DIR
  2. Runs bindgen on wrapper.h to generate FFI bindings (falls back to src/bindings.rs on failure)
  3. Configures and builds whisper.cpp via CMake with feature-dependent flags (GGML_CUDA, GGML_HIP, GGML_VULKAN, GGML_METAL, etc.)
  4. Statically links: whisper, ggml, ggml-base, ggml-cpu, plus backend-specific libs
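Step 3 can be illustrated with a small standard-library sketch. Cargo exposes each enabled feature to a build script as a `CARGO_FEATURE_<NAME>` environment variable; the pairing of crate features with GGML options in `BACKENDS` is an assumption inferred from the names listed above, and the helpers `feature_env_name` and `cmake_defines` are hypothetical rather than taken from sys/build.rs.

```rust
/// Cargo exposes each enabled feature to build scripts as an environment
/// variable named CARGO_FEATURE_<NAME>, uppercased with '-' replaced by '_'.
fn feature_env_name(feature: &str) -> String {
    format!("CARGO_FEATURE_{}", feature.to_uppercase().replace('-', "_"))
}

/// Assumed pairing of crate features with the GGML CMake options named above.
const BACKENDS: &[(&str, &str)] = &[
    ("cuda", "GGML_CUDA"),
    ("hipblas", "GGML_HIP"),
    ("vulkan", "GGML_VULKAN"),
    ("metal", "GGML_METAL"),
];

/// Returns the CMake defines to switch on, given a predicate that reports
/// whether an environment variable is set.
fn cmake_defines(is_set: impl Fn(&str) -> bool) -> Vec<(&'static str, &'static str)> {
    BACKENDS
        .iter()
        .filter(|(feature, _)| is_set(&feature_env_name(feature)))
        .map(|(_, define)| (*define, "ON"))
        .collect()
}

fn main() {
    // Outside a Cargo build script these variables are normally unset,
    // so this prints nothing unless they are exported by hand.
    let defines = cmake_defines(|name| std::env::var_os(name).is_some());
    for (key, value) in defines {
        println!("-D{key}={value}");
    }
}
```

Taking the lookup as a closure keeps the mapping testable without touching the real process environment.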

Feature Flags

Feature          Purpose
cuda             NVIDIA GPU (needs CUDA toolkit)
hipblas          AMD GPU via ROCm
metal            Apple Metal GPU
vulkan           Vulkan GPU
openblas         OpenBLAS acceleration (requires BLAS_INCLUDE_DIRS env var)
openmp           OpenMP threading
coreml           Apple CoreML
intel-sycl       Intel SYCL
raw-api          Re-export whisper-rs-sys types publicly
log_backend      Route C++ logs to the log crate
tracing_backend  Route C++ logs to the tracing crate

Nix Development Environment

The flake.nix provides a devshell with all dependencies for Vulkan and ROCm/hipBLAS builds. Use direnv allow or nix develop to enter the environment. Key env vars (LIBCLANG_PATH, BINDGEN_EXTRA_CLANG_ARGS, HIP_PATH, VULKAN_SDK) are set automatically.

PR Conventions

Per .github/PULL_REQUEST_TEMPLATE.md:

  • Run cargo fmt and cargo clippy before submitting
  • Self-review code for legibility
  • No GenAI-generated code in PRs