# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

whisper-rs provides safe Rust bindings to [whisper.cpp](https://github.com/ggerganov/whisper.cpp), a C++ speech recognition library. It's a two-crate workspace:

- **`whisper-rs`** (root) — Safe public API
- **`whisper-rs-sys`** (`sys/`) — FFI bindings generated via bindgen, with a CMake-based build of the whisper.cpp submodule

The upstream C++ source lives in `sys/whisper.cpp/` (git submodule — clone with `--recursive`).

## Build Commands

```bash
cargo build                              # Default build
cargo build --release --features vulkan  # With Vulkan GPU support
cargo build --release --features hipblas # With AMD ROCm support
cargo build --release --features cuda    # With NVIDIA CUDA support
cargo test                               # Run all tests
cargo fmt                                # Format code
cargo clippy                             # Lint
```

**Running examples** (require a GGML model file and a WAV audio file):

```bash
cargo run --example basic_use -- model.bin audio.wav
cargo run --example audio_transcription -- model.bin audio.wav
cargo run --example vad -- model.bin audio.wav output.wav
```

**Skipping bindgen** (use pre-generated bindings): set `WHISPER_DONT_GENERATE_BINDINGS=1`.

All `WHISPER_*` and `CMAKE_*` env vars are forwarded to the CMake build.

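The forwarding rule for `WHISPER_*` and `CMAKE_*` variables can be pictured as a prefix filter over the process environment. The sketch below is illustrative only and is not taken from `sys/build.rs`; the helper name `forwarded_vars` is hypothetical, and the real build script may handle special cases that this version ignores.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper (not from the repository): keeps only the variables
/// whose names start with `WHISPER_` or `CMAKE_`, i.e. the ones described
/// as being forwarded to the CMake build.
fn forwarded_vars<I>(vars: I) -> BTreeMap<String, String>
where
    I: IntoIterator<Item = (String, String)>,
{
    vars.into_iter()
        .filter(|(name, _)| name.starts_with("WHISPER_") || name.starts_with("CMAKE_"))
        .collect()
}
```

In a build script, a helper like this would likely be called as `forwarded_vars(std::env::vars())`, with each resulting pair handed to the CMake configuration step. Returning a `BTreeMap` keeps the order deterministic, which makes the result easier to log and compare between builds.
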
## Architecture

```
whisper-rs (safe Rust API)
  → whisper-rs-sys (bindgen FFI)
    → whisper.cpp (C++ submodule, built via CMake)
      → GGML (tensor library with CPU/GPU backends)
```

**Key types and their relationships:**

- `WhisperContext` — Arc-wrapped model handle, thread-safe, created from a model file
- `WhisperState` — Inference state created from a context; multiple states can share one context
- `FullParams` — Transcription configuration (sampling strategy, language, callbacks)
- `WhisperSegment` / `WhisperToken` — Result types with timestamps and probabilities

**Core flow:** `WhisperContext::new_with_params()` → `ctx.create_state()` → `state.full(params, &audio_data)` → iterate segments via `state.as_iter()`

**Module layout in `src/`:**

- `whisper_ctx.rs` — Raw context wrapper (`WhisperInnerContext`)
- `whisper_ctx_wrapper.rs` — Safe public `WhisperContext`
- `whisper_state/` — State management, segments, tokens, iterators
- `whisper_params.rs` — `FullParams`, `SamplingStrategy` (Greedy / BeamSearch)
- `whisper_vad.rs` — Voice Activity Detection
- `whisper_grammar.rs` — GBNF grammar-constrained decoding
- `vulkan.rs` — Vulkan device enumeration (behind `vulkan` feature)

## Build System (sys/build.rs)

The build script:

1. Copies `whisper.cpp` sources into `OUT_DIR`
2. Runs bindgen on `wrapper.h` to generate FFI bindings (falls back to `src/bindings.rs` on failure)
3. Configures and builds whisper.cpp via CMake with feature-dependent flags (`GGML_CUDA`, `GGML_HIP`, `GGML_VULKAN`, `GGML_METAL`, etc.)
4. Statically links: `whisper`, `ggml`, `ggml-base`, `ggml-cpu`, plus backend-specific libs

## Feature Flags

| Feature | Purpose |
|---------|---------|
| `cuda` | NVIDIA GPU (needs CUDA toolkit) |
| `hipblas` | AMD GPU via ROCm |
| `metal` | Apple Metal GPU |
| `vulkan` | Vulkan GPU |
| `openblas` | OpenBLAS acceleration (requires `BLAS_INCLUDE_DIRS` env var) |
| `openmp` | OpenMP threading |
| `coreml` | Apple CoreML |
| `intel-sycl` | Intel SYCL |
| `raw-api` | Re-export `whisper-rs-sys` types publicly |
| `log_backend` | Route C++ logs to the `log` crate |
| `tracing_backend` | Route C++ logs to the `tracing` crate |

## Nix Development Environment

The `flake.nix` provides a devshell with all dependencies for Vulkan and ROCm/hipBLAS builds. Use `direnv allow` or `nix develop` to enter the environment. Key env vars (`LIBCLANG_PATH`, `BINDGEN_EXTRA_CLANG_ARGS`, `HIP_PATH`, `VULKAN_SDK`) are set automatically.

## PR Conventions

Per `.github/PULL_REQUEST_TEMPLATE.md`:

- Run `cargo fmt` and `cargo clippy` before submitting
- Self-review code for legibility
- No GenAI-generated code in PRs