Add Wyoming protocol ASR server and nix devshell

New wyoming-whisper-rs binary crate implementing the Wyoming protocol over TCP, making whisper-rs usable with Home Assistant's voice pipeline. Includes nix flake devshell with Vulkan, ROCm/hipBLAS, clippy, and rustfmt support.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parent d38738df8d
commit 50fdb08a38
12 changed files with 840 additions and 1 deletions
97
CLAUDE.md
Normal file
@@ -0,0 +1,97 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

whisper-rs provides safe Rust bindings to [whisper.cpp](https://github.com/ggerganov/whisper.cpp), a C++ speech recognition library. It's a two-crate workspace:

- **`whisper-rs`** (root): Safe public API
- **`whisper-rs-sys`** (`sys/`): FFI bindings generated via bindgen, with a CMake-based build of the whisper.cpp submodule

The upstream C++ source lives in `sys/whisper.cpp/` (a git submodule; clone with `--recursive`).

## Build Commands

```bash
cargo build                               # Default build
cargo build --release --features vulkan   # With Vulkan GPU support
cargo build --release --features hipblas  # With AMD ROCm support
cargo build --release --features cuda     # With NVIDIA CUDA support
cargo test                                # Run all tests
cargo fmt                                 # Format code
cargo clippy                              # Lint
```

**Running examples** (require a GGML model file and a WAV audio file):

```bash
cargo run --example basic_use -- model.bin audio.wav
cargo run --example audio_transcription -- model.bin audio.wav
cargo run --example vad -- model.bin audio.wav output.wav
```

**Skipping bindgen** (use pre-generated bindings): set `WHISPER_DONT_GENERATE_BINDINGS=1`.

All `WHISPER_*` and `CMAKE_*` env vars are forwarded to the CMake build.

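As a rough illustration of the forwarding rule described above, the sketch below filters an environment by the two prefixes. `forwarded_vars` is a hypothetical helper written for this note, not a function from `sys/build.rs`, and the real script may apply exceptions that are not shown here.

```rust
/// Keep only the variables whose names start with `WHISPER_` or `CMAKE_`,
/// i.e. the ones described as being forwarded to the CMake build.
/// Hypothetical helper: this is an illustration, not code from `sys/build.rs`.
pub fn forwarded_vars<I>(vars: I) -> Vec<(String, String)>
where
    I: IntoIterator<Item = (String, String)>,
{
    let mut kept: Vec<(String, String)> = vars
        .into_iter()
        .filter(|(name, _)| name.starts_with("WHISPER_") || name.starts_with("CMAKE_"))
        .collect();
    // Sort by name (then value) so the result is deterministic.
    kept.sort();
    kept
}
```

Calling `forwarded_vars(std::env::vars())` would list the variables from the current process that match either prefix.
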
## Architecture

```
whisper-rs (safe Rust API)
  → whisper-rs-sys (bindgen FFI)
    → whisper.cpp (C++ submodule, built via CMake)
      → GGML (tensor library with CPU/GPU backends)
```

**Key types and their relationships:**

- `WhisperContext`: Arc-wrapped model handle, thread-safe, created from a model file
- `WhisperState`: Inference state created from a context; multiple states can share one context
- `FullParams`: Transcription configuration (sampling strategy, language, callbacks)
- `WhisperSegment` / `WhisperToken`: Result types with timestamps and probabilities

**Core flow:** `WhisperContext::new_with_params()` → `ctx.create_state()` → `state.full(params, &audio_data)` → iterate segments via `state.as_iter()`

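The ownership pattern behind "multiple states can share one context" can be sketched with the standard library alone. `Context`, `State`, `create_state`, and `transcribe_in_parallel` below are stand-ins invented for this illustration; they are not the whisper-rs types or signatures, and the "inference" is a fake rule that emits one segment per 16,000 samples.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for a loaded model. NOT the real `WhisperContext`.
pub struct Context {
    model_name: String,
}

/// Stand-in for per-inference state. NOT the real `WhisperState`.
pub struct State {
    ctx: Arc<Context>,
    segments: Vec<String>,
}

impl Context {
    /// Wrap the context in an `Arc` so several states can share it.
    pub fn new(model_name: &str) -> Arc<Self> {
        Arc::new(Context { model_name: model_name.to_string() })
    }
}

/// Each state keeps its own `Arc` clone, so the context outlives every state.
pub fn create_state(ctx: &Arc<Context>) -> State {
    State { ctx: Arc::clone(ctx), segments: Vec::new() }
}

impl State {
    /// Fake inference: one "segment" per 16,000 samples (an arbitrary rule).
    pub fn full(&mut self, audio: &[f32]) {
        let count = audio.len() / 16_000;
        self.segments = (0..count)
            .map(|i| format!("[{}] segment {}", self.ctx.model_name, i))
            .collect();
    }
}

/// Run one independent state per clip against a single shared context,
/// each on its own thread, and return the segment count for each clip.
pub fn transcribe_in_parallel(ctx: &Arc<Context>, clips: Vec<Vec<f32>>) -> Vec<usize> {
    let handles: Vec<_> = clips
        .into_iter()
        .map(|clip| {
            let mut state = create_state(ctx);
            thread::spawn(move || {
                state.full(&clip);
                state.segments.len()
            })
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

The point of the design is that the expensive, read-only part (the model) is loaded once, while each thread owns the mutable part (its state) outright, so no locking is needed in this sketch.
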
**Module layout in `src/`:**

- `whisper_ctx.rs`: Raw context wrapper (`WhisperInnerContext`)
- `whisper_ctx_wrapper.rs`: Safe public `WhisperContext`
- `whisper_state/`: State management, segments, tokens, iterators
- `whisper_params.rs`: `FullParams`, `SamplingStrategy` (Greedy / BeamSearch)
- `whisper_vad.rs`: Voice Activity Detection
- `whisper_grammar.rs`: GBNF grammar-constrained decoding
- `vulkan.rs`: Vulkan device enumeration (behind `vulkan` feature)

## Build System (`sys/build.rs`)

The build script:

1. Copies `whisper.cpp` sources into `OUT_DIR`
2. Runs bindgen on `wrapper.h` to generate FFI bindings (falls back to `src/bindings.rs` on failure)
3. Configures and builds whisper.cpp via CMake with feature-dependent flags (`GGML_CUDA`, `GGML_HIP`, `GGML_VULKAN`, `GGML_METAL`, etc.)
4. Statically links: `whisper`, `ggml`, `ggml-base`, `ggml-cpu`, plus backend-specific libs

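Steps 2 and 3 above can be sketched as two small functions. This is a minimal illustration, not the contents of `sys/build.rs`: the names `cmake_flag_for`, `Bindings`, and `resolve_bindings` are hypothetical, and the feature-to-flag pairing is an assumption inferred from the flag names listed in step 3.

```rust
/// Map a Cargo feature name to a CMake option from step 3.
/// ASSUMPTION: the pairing is inferred from the names, not read from build.rs.
/// Features without a listed flag return `None`.
pub fn cmake_flag_for(feature: &str) -> Option<&'static str> {
    match feature {
        "cuda" => Some("GGML_CUDA"),
        "hipblas" => Some("GGML_HIP"),
        "vulkan" => Some("GGML_VULKAN"),
        "metal" => Some("GGML_METAL"),
        _ => None,
    }
}

/// Where the FFI bindings came from.
#[derive(Debug, PartialEq)]
pub enum Bindings {
    /// Freshly generated source text.
    Generated(String),
    /// Path of the checked-in fallback file.
    Pregenerated(&'static str),
}

/// Step 2: try the generator; on any error, fall back to `src/bindings.rs`.
/// `generate` stands in for the bindgen invocation.
pub fn resolve_bindings<F>(generate: F) -> Bindings
where
    F: FnOnce() -> Result<String, String>,
{
    match generate() {
        Ok(source) => Bindings::Generated(source),
        Err(_) => Bindings::Pregenerated("src/bindings.rs"),
    }
}
```

Passing the generator as a closure keeps the fallback decision separate from how the bindings are produced, which makes the failure path easy to exercise without libclang installed.
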
## Feature Flags

| Feature | Purpose |
|---------|---------|
| `cuda` | NVIDIA GPU (needs CUDA toolkit) |
| `hipblas` | AMD GPU via ROCm |
| `metal` | Apple Metal GPU |
| `vulkan` | Vulkan GPU |
| `openblas` | OpenBLAS acceleration (requires `BLAS_INCLUDE_DIRS` env var) |
| `openmp` | OpenMP threading |
| `coreml` | Apple CoreML |
| `intel-sycl` | Intel SYCL |
| `raw-api` | Re-export `whisper-rs-sys` types publicly |
| `log_backend` | Route C++ logs to the `log` crate |
| `tracing_backend` | Route C++ logs to the `tracing` crate |

## Nix Development Environment

The `flake.nix` provides a devshell with all dependencies for Vulkan and ROCm/hipBLAS builds. Use `direnv allow` or `nix develop` to enter the environment. Key env vars (`LIBCLANG_PATH`, `BINDGEN_EXTRA_CLANG_ARGS`, `HIP_PATH`, `VULKAN_SDK`) are set automatically.

## PR Conventions

Per `.github/PULL_REQUEST_TEMPLATE.md`:

- Run `cargo fmt` and `cargo clippy` before submitting
- Self-review code for legibility
- No GenAI-generated code in PRs