Harald Hoyer 50fdb08a38 Add Wyoming protocol ASR server and nix devshell

New wyoming-whisper-rs binary crate implementing the Wyoming protocol
over TCP, making whisper-rs usable with Home Assistant's voice pipeline.
Includes nix flake devshell with Vulkan, ROCm/hipBLAS, clippy, and
rustfmt support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-24 11:44:03 +01:00

4.1 KiB


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

whisper-rs provides safe Rust bindings to whisper.cpp, a C++ speech recognition library. It's a two-crate workspace:

whisper-rs (root) — Safe public API
whisper-rs-sys (sys/) — FFI bindings generated via bindgen, with a CMake-based build of the whisper.cpp submodule

The upstream C++ source lives in sys/whisper.cpp/ (git submodule — clone with --recursive).

Build Commands

cargo build                                    # Default build
cargo build --release --features vulkan        # With Vulkan GPU support
cargo build --release --features hipblas       # With AMD ROCm support
cargo build --release --features cuda          # With NVIDIA CUDA support
cargo test                                     # Run all tests
cargo fmt                                      # Format code
cargo clippy                                   # Lint

Running examples (require a GGML model file and a WAV audio file):

cargo run --example basic_use -- model.bin audio.wav
cargo run --example audio_transcription -- model.bin audio.wav
cargo run --example vad -- model.bin audio.wav output.wav

To skip bindgen and use the pre-generated bindings, set WHISPER_DONT_GENERATE_BINDINGS=1.

All WHISPER_* and CMAKE_* env vars are forwarded to the CMake build.
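As a rough illustration of the forwarding rule above, the sketch below filters a set of environment variables by the two prefixes. It is not the code in sys/build.rs; the function name `forwarded_vars` and the constant `FORWARDED_PREFIXES` are hypothetical, and how the real build script hands the values to CMake is not shown here.

```rust
use std::collections::BTreeMap;

/// The two prefixes named in the text above.
const FORWARDED_PREFIXES: [&str; 2] = ["WHISPER_", "CMAKE_"];

/// Hypothetical helper: keep only the variables whose name starts with one
/// of the forwarded prefixes. `vars` can be any iterator of (name, value)
/// pairs, for example `std::env::vars()`.
fn forwarded_vars<I>(vars: I) -> BTreeMap<String, String>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|(name, _)| FORWARDED_PREFIXES.iter().any(|p| name.starts_with(*p)))
        .collect()
}
```

Calling `forwarded_vars(std::env::vars())` in a build script would yield the subset to pass along. Note that under this prefix rule WHISPER_DONT_GENERATE_BINDINGS itself also matches.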

Architecture

whisper-rs (safe Rust API)
  → whisper-rs-sys (bindgen FFI)
    → whisper.cpp (C++ submodule, built via CMake)
      → GGML (tensor library with CPU/GPU backends)

Key types and their relationships:

WhisperContext — Arc-wrapped model handle, thread-safe, created from a model file
WhisperState — Inference state created from a context; multiple states can share one context
FullParams — Transcription configuration (sampling strategy, language, callbacks)
WhisperSegment / WhisperToken — Result types with timestamps and probabilities

Core flow: WhisperContext::new_with_params() → ctx.create_state() → state.full(params, &audio_data) → iterate segments via state.as_iter()
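The ownership relationship behind this flow (one shared, thread-safe context and many independent states) can be modelled with the standard library alone. The sketch below does not use whisper-rs: `Model`, `Context`, `State`, `CHUNK`, and `transcribe_all` are hypothetical stand-ins, and the "inference" is a placeholder that only counts fixed-size chunks of samples.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical chunk size used by the placeholder "inference" below.
const CHUNK: usize = 4;

/// Stand-in for loaded model weights (immutable after loading).
struct Model {
    name: String,
}

/// Stand-in for a context: a cheaply clonable, thread-safe model handle.
#[derive(Clone)]
struct Context {
    model: Arc<Model>,
}

/// Stand-in for inference state: shares the model, owns its own results.
struct State {
    model: Arc<Model>,
    segments: Vec<String>,
}

impl Context {
    fn load(name: &str) -> Self {
        Context { model: Arc::new(Model { name: name.to_string() }) }
    }

    /// Each state holds another reference to the same model.
    fn create_state(&self) -> State {
        State { model: Arc::clone(&self.model), segments: Vec::new() }
    }
}

impl State {
    /// Placeholder inference: emit one "segment" per chunk of samples.
    fn full(&mut self, audio: &[f32]) {
        self.segments = audio
            .chunks(CHUNK)
            .enumerate()
            .map(|(i, _)| format!("[{}] segment {}", self.model.name, i))
            .collect();
    }
}

/// Run one state per clip on its own thread, all sharing a single context.
/// Returns the number of segments produced for each clip, in input order.
fn transcribe_all(ctx: &Context, clips: Vec<Vec<f32>>) -> Vec<usize> {
    let handles: Vec<_> = clips
        .into_iter()
        .map(|clip| {
            let ctx = ctx.clone();
            thread::spawn(move || {
                let mut state = ctx.create_state();
                state.full(&clip);
                state.segments.len()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The point of the design is that the expensive part (the model) is loaded once, while each state carries only its own mutable results, so states never need to synchronize with one another.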

Module layout in src/:

whisper_ctx.rs — Raw context wrapper (WhisperInnerContext)
whisper_ctx_wrapper.rs — Safe public WhisperContext
whisper_state/ — State management, segments, tokens, iterators
whisper_params.rs — FullParams, SamplingStrategy (Greedy / BeamSearch)
whisper_vad.rs — Voice Activity Detection
whisper_grammar.rs — GBNF grammar-constrained decoding
vulkan.rs — Vulkan device enumeration (behind vulkan feature)

Build System (sys/build.rs)

The build script:

Copies whisper.cpp sources into OUT_DIR
Runs bindgen on wrapper.h to generate FFI bindings (falls back to src/bindings.rs on failure)
Configures and builds whisper.cpp via CMake with feature-dependent flags (GGML_CUDA, GGML_HIP, GGML_VULKAN, GGML_METAL, etc.)
Statically links: whisper, ggml, ggml-base, ggml-cpu, plus backend-specific libs
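The bindgen step and its fallback can be sketched as a small decision function. This is an illustration under assumptions, not the contents of sys/build.rs: `BindingSource` and `choose_bindings` are hypothetical names, the generator is passed in as a closure so that no bindgen call is invented here, and whether the skip variable is tested for presence or for a specific value is left to the caller.

```rust
use std::path::PathBuf;

/// Where the FFI bindings for this build come from (hypothetical type).
#[derive(Debug, PartialEq)]
enum BindingSource {
    /// Freshly generated binding source code.
    Generated(String),
    /// The pre-generated file checked into the repository.
    Pregenerated(PathBuf),
}

/// Decide which bindings to use.
/// `skip_bindgen` reflects the WHISPER_DONT_GENERATE_BINDINGS setting;
/// `generate` stands in for the real generator and may fail.
fn choose_bindings(
    skip_bindgen: bool,
    generate: impl FnOnce() -> Result<String, String>,
) -> BindingSource {
    let fallback = PathBuf::from("src/bindings.rs");
    if skip_bindgen {
        // The generator is never invoked when skipping is requested.
        return BindingSource::Pregenerated(fallback);
    }
    match generate() {
        Ok(code) => BindingSource::Generated(code),
        // On failure, fall back to the pre-generated bindings.
        Err(_) => BindingSource::Pregenerated(fallback),
    }
}
```

Both the explicit skip and a generator failure end in the same fallback path, which is why a build can still succeed on machines without a working libclang.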

Feature Flags

| Feature | Purpose |
|---|---|
| `cuda` | NVIDIA GPU (needs CUDA toolkit) |
| `hipblas` | AMD GPU via ROCm |
| `metal` | Apple Metal GPU |
| `vulkan` | Vulkan GPU |
| `openblas` | OpenBLAS acceleration (requires `BLAS_INCLUDE_DIRS` env var) |
| `openmp` | OpenMP threading |
| `coreml` | Apple CoreML |
| `intel-sycl` | Intel SYCL |
| `raw-api` | Re-export `whisper-rs-sys` types publicly |
| `log_backend` | Route C++ logs to the `log` crate |
| `tracing_backend` | Route C++ logs to the `tracing` crate |
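To make the link between Cargo features and the feature-dependent CMake flags concrete, here is a minimal sketch. Only the four CMake flag names mentioned in the Build System section are used; pairing each one with a feature is inferred from the names and should be checked against sys/build.rs. The functions `cmake_flag_for` and `cmake_args`, and the `-D<FLAG>=ON` argument form, are assumptions made for illustration.

```rust
/// Hypothetical mapping from a Cargo feature name to a CMake option.
/// Features without a flag named in this document return `None`.
fn cmake_flag_for(feature: &str) -> Option<&'static str> {
    match feature {
        "cuda" => Some("GGML_CUDA"),
        "hipblas" => Some("GGML_HIP"),
        "vulkan" => Some("GGML_VULKAN"),
        "metal" => Some("GGML_METAL"),
        _ => None,
    }
}

/// Build `-D<FLAG>=ON` arguments for the enabled features,
/// silently skipping features that have no known CMake flag.
fn cmake_args(enabled: &[&str]) -> Vec<String> {
    enabled
        .iter()
        .filter_map(|f| cmake_flag_for(*f))
        .map(|flag| format!("-D{flag}=ON"))
        .collect()
}
```

For example, `cmake_args(&["vulkan", "raw-api"])` produces a single argument for Vulkan, since `raw-api` only affects the Rust side of the build.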

Nix Development Environment

The flake.nix provides a devshell with all dependencies for Vulkan and ROCm/hipBLAS builds. Use direnv allow or nix develop to enter the environment. Key env vars (LIBCLANG_PATH, BINDGEN_EXTRA_CLANG_ARGS, HIP_PATH, VULKAN_SDK) are set automatically.

PR Conventions

Per .github/PULL_REQUEST_TEMPLATE.md:

Run cargo fmt and cargo clippy before submitting
Self-review code for legibility
No GenAI-generated code in PRs
