New wyoming-whisper-rs binary crate implementing the Wyoming protocol over TCP, making whisper-rs usable with Home Assistant's voice pipeline. Includes nix flake devshell with Vulkan, ROCm/hipBLAS, clippy, and rustfmt support. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
whisper-rs provides safe Rust bindings to whisper.cpp, a C++ speech recognition library. It's a two-crate workspace:
- whisper-rs (root): Safe public API
- whisper-rs-sys (sys/): FFI bindings generated via bindgen, with a CMake-based build of the whisper.cpp submodule
The upstream C++ source lives in sys/whisper.cpp/ (git submodule — clone with --recursive).
Build Commands
cargo build # Default build
cargo build --release --features vulkan # With Vulkan GPU support
cargo build --release --features hipblas # With AMD ROCm support
cargo build --release --features cuda # With NVIDIA CUDA support
cargo test # Run all tests
cargo fmt # Format code
cargo clippy # Lint
Running examples (require a GGML model file and a WAV audio file):
cargo run --example basic_use -- model.bin audio.wav
cargo run --example audio_transcription -- model.bin audio.wav
cargo run --example vad -- model.bin audio.wav output.wav
Skipping bindgen (use pre-generated bindings): set WHISPER_DONT_GENERATE_BINDINGS=1.
All WHISPER_* and CMAKE_* env vars are forwarded to the CMake build.
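The forwarding rule can be pictured as a prefix filter over the process environment. The following is only a minimal sketch under that assumption, not the code in sys/build.rs: the helper name `forwardable_vars` is hypothetical, and how the real script hands the values to CMake is not shown.

```rust
use std::env;

/// Hypothetical helper: keep only the variables whose names start with
/// `WHISPER_` or `CMAKE_`, mirroring the forwarding rule described above.
fn forwardable_vars<I>(vars: I) -> Vec<(String, String)>
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut kept: Vec<(String, String)> = vars
        .into_iter()
        .filter(|(name, _)| name.starts_with("WHISPER_") || name.starts_with("CMAKE_"))
        .collect();
    // Sort for deterministic output; the environment itself has no fixed order.
    kept.sort();
    kept
}

fn main() {
    // `env::vars()` panics on non-UTF-8 entries; `env::vars_os()` avoids that.
    for (name, value) in forwardable_vars(env::vars()) {
        println!("{name}={value}");
    }
}
```

Taking an iterator rather than reading the environment directly keeps the filter easy to exercise with fixed input.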
Architecture
whisper-rs (safe Rust API)
→ whisper-rs-sys (bindgen FFI)
→ whisper.cpp (C++ submodule, built via CMake)
→ GGML (tensor library with CPU/GPU backends)
Key types and their relationships:
- WhisperContext: Arc-wrapped model handle, thread-safe, created from a model file
- WhisperState: Inference state created from a context; multiple states can share one context
- FullParams: Transcription configuration (sampling strategy, language, callbacks)
- WhisperSegment / WhisperToken: Result types with timestamps and probabilities
Core flow: WhisperContext::new_with_params() → ctx.create_state() → state.full(params, &audio_data) → iterate segments via state.as_iter()
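The relationship between a shared context and per-inference states can be illustrated with stand-in types. This sketch deliberately does not call whisper-rs: `ModelHandle`, `Context`, `State`, and `transcribe_in_parallel` are hypothetical names that only mimic the shape of the flow (create context, create state, run, iterate), and the "inference" is faked.

```rust
use std::sync::Arc;
use std::thread;

// Stand-in types only: they mimic the relationship described above
// and are NOT the whisper-rs API.
struct ModelHandle {
    name: String,
}

/// Cheap-to-clone context: the model lives behind an `Arc`.
#[derive(Clone)]
struct Context {
    inner: Arc<ModelHandle>,
}

/// Per-inference state that keeps the shared model alive.
struct State {
    model: Arc<ModelHandle>,
    segments: Vec<String>,
}

impl Context {
    fn new(name: &str) -> Self {
        Context { inner: Arc::new(ModelHandle { name: name.to_string() }) }
    }

    fn create_state(&self) -> State {
        State { model: Arc::clone(&self.inner), segments: Vec::new() }
    }
}

impl State {
    /// Pretend inference: one fake segment per 16_000 samples
    /// (an arbitrary chunk size chosen for this sketch).
    fn full(&mut self, audio: &[f32]) {
        let chunks = audio.len() / 16_000;
        self.segments = (0..chunks)
            .map(|i| format!("{} segment {}", self.model.name, i))
            .collect();
    }

    fn as_iter(&self) -> impl Iterator<Item = &String> {
        self.segments.iter()
    }
}

/// Each thread gets its own state; all of them share one model.
fn transcribe_in_parallel(ctx: &Context, clips: Vec<Vec<f32>>) -> Vec<usize> {
    let handles: Vec<_> = clips
        .into_iter()
        .map(|clip| {
            let ctx = ctx.clone();
            thread::spawn(move || {
                let mut state = ctx.create_state();
                state.full(&clip);
                state.as_iter().count()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let ctx = Context::new("model");
    let counts = transcribe_in_parallel(&ctx, vec![vec![0.0; 32_000], vec![0.0; 16_000]]);
    println!("{counts:?}");
}
```

The point of the design is that the expensive model data is loaded once and shared, while mutable scratch data stays in each state, so no locking is needed around inference.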
Module layout in src/:
- whisper_ctx.rs: Raw context wrapper (WhisperInnerContext)
- whisper_ctx_wrapper.rs: Safe public WhisperContext
- whisper_state/: State management, segments, tokens, iterators
- whisper_params.rs: FullParams, SamplingStrategy (Greedy / BeamSearch)
- whisper_vad.rs: Voice Activity Detection
- whisper_grammar.rs: GBNF grammar-constrained decoding
- vulkan.rs: Vulkan device enumeration (behind the vulkan feature)
Build System (sys/build.rs)
The build script:
- Copies whisper.cpp sources into OUT_DIR
- Runs bindgen on wrapper.h to generate FFI bindings (falls back to src/bindings.rs on failure)
- Configures and builds whisper.cpp via CMake with feature-dependent flags (GGML_CUDA, GGML_HIP, GGML_VULKAN, GGML_METAL, etc.)
- Statically links: whisper, ggml, ggml-base, ggml-cpu, plus backend-specific libs
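The feature-dependent flags step can be sketched as a lookup from Cargo features to CMake definitions. This is an illustration under assumptions, not the contents of sys/build.rs: the helper names are hypothetical, the feature-to-flag pairing is inferred from the names in this document, and the `-D<FLAG>=ON` argument form is assumed. The one fixed point is that Cargo exposes each enabled feature to a build script as a `CARGO_FEATURE_<NAME>` environment variable.

```rust
use std::env;

/// Feature names paired with CMake flags named in the list above.
/// The pairing itself (e.g. `hipblas` with `GGML_HIP`) is an assumption of this sketch.
const FEATURE_FLAGS: &[(&str, &str)] = &[
    ("cuda", "GGML_CUDA"),
    ("hipblas", "GGML_HIP"),
    ("vulkan", "GGML_VULKAN"),
    ("metal", "GGML_METAL"),
];

/// Cargo sets `CARGO_FEATURE_<NAME>` for each enabled feature, with the
/// name uppercased and `-` replaced by `_`.
fn feature_env_name(feature: &str) -> String {
    format!("CARGO_FEATURE_{}", feature.to_uppercase().replace('-', "_"))
}

/// Hypothetical helper: build `-D<FLAG>=ON` arguments for the enabled features.
/// `is_enabled` receives the environment variable name to check.
fn cmake_defines(is_enabled: impl Fn(&str) -> bool) -> Vec<String> {
    FEATURE_FLAGS
        .iter()
        .filter(|(feature, _)| is_enabled(feature_env_name(feature).as_str()))
        .map(|(_, flag)| format!("-D{flag}=ON"))
        .collect()
}

fn main() {
    // Inside a real build script these variables are set by Cargo;
    // run standalone, this usually prints an empty list.
    let defines = cmake_defines(|name| env::var_os(name).is_some());
    println!("{defines:?}");
}
```

Passing the lookup as a closure keeps the mapping independent of the real environment, which makes it straightforward to check with fixed inputs.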
Feature Flags
| Feature | Purpose |
|---|---|
| cuda | NVIDIA GPU (needs CUDA toolkit) |
| hipblas | AMD GPU via ROCm |
| metal | Apple Metal GPU |
| vulkan | Vulkan GPU |
| openblas | OpenBLAS acceleration (requires BLAS_INCLUDE_DIRS env var) |
| openmp | OpenMP threading |
| coreml | Apple CoreML |
| intel-sycl | Intel SYCL |
| raw-api | Re-export whisper-rs-sys types publicly |
| log_backend | Route C++ logs to the log crate |
| tracing_backend | Route C++ logs to the tracing crate |
Nix Development Environment
The flake.nix provides a devshell with all dependencies for Vulkan and ROCm/hipBLAS builds. Use direnv allow or nix develop to enter the environment. Key env vars (LIBCLANG_PATH, BINDGEN_EXTRA_CLANG_ARGS, HIP_PATH, VULKAN_SDK) are set automatically.
PR Conventions
Per .github/PULL_REQUEST_TEMPLATE.md:
- Run cargo fmt and cargo clippy before submitting
- Self-review code for legibility
- No GenAI-generated code in PRs