Remove CPU fallbacks for voice note transcription; auto/cuda/cpu fail fast

- Remove _cuda_failed_models and inference-time CPU fallback
- auto: try CUDA only, fail fast on RuntimeError (no CPU fallback)
- cpu/cuda: use device directly, fail fast on errors
- Update docs in config, .env.example, README

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
Cursor Agent 2026-02-18 13:37:23 +00:00
parent f07d38655a
commit eabe8db2e8
4 changed files with 10 additions and 32 deletions
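
The device semantics listed in the commit message can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: `resolve_device`, `load_model`, and the `loader` callback are hypothetical names, and the only behavior taken from the commit is that `auto` targets CUDA with no CPU fallback and that `cpu`/`cuda` are used directly.

```python
VALID_DEVICES = ("cpu", "cuda", "auto")


def resolve_device(setting: str) -> str:
    """Map a WHISPER_DEVICE value to the device handed to the model loader.

    Hypothetical helper: the name and signature are assumptions.
    """
    value = setting.strip().lower()
    if value not in VALID_DEVICES:
        raise ValueError(f"WHISPER_DEVICE must be one of {VALID_DEVICES}, got {setting!r}")
    # "auto" now means "CUDA only"; there is no CPU branch any more.
    return "cuda" if value == "auto" else value


def load_model(setting: str, loader):
    """Call the (hypothetical) loader exactly once on the resolved device.

    There is deliberately no try/except here: a RuntimeError raised by the
    loader (for example when CUDA is unavailable) propagates to the caller
    instead of triggering a retry on CPU.
    """
    device = resolve_device(setting)
    return loader(device)
```

With this shape, a misconfigured GPU host surfaces the error at load time rather than silently running on CPU, which matches the "fail fast" intent stated above.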

@@ -38,7 +38,7 @@ WHISPER_MODEL=base
 HF_TOKEN=""
-# WHISPER_DEVICE: "cpu" | "cuda" | "auto" (auto = try GPU, fall back to CPU)
+# WHISPER_DEVICE: "cpu" | "cuda" | "auto" (auto = try CUDA, fail fast; no fallback)
 WHISPER_DEVICE=cpu