Remove CPU fallbacks for voice note transcribe; auto/cuda/cpu fail fast

- Remove _cuda_failed_models and inference-time CPU fallback
- auto: try CUDA only, fail fast on RuntimeError (no CPU fallback)
- cpu/cuda: use device directly, fail fast on errors
- Update docs in config, .env.example, README

Co-authored-by: Ali Khokhar <alishahryar2@gmail.com>
2026-04-28 11:30:03 +00:00 · 2026-02-18 13:37:23 +00:00 · 2026-02-18 13:37:23 +00:00 · eabe8db2e8
commit eabe8db2e8
parent f07d38655a
4 changed files with 10 additions and 32 deletions
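
The device semantics described in the commit message can be illustrated with a minimal Python sketch. It is not the repository's code: `resolve_device`, `load_fail_fast`, and the `loader` callback are hypothetical names introduced here, and the sketch assumes that "auto" simply maps to CUDA and that any error raised while loading propagates to the caller.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Values accepted for WHISPER_DEVICE, per the .env.example comment.
VALID_DEVICES = ("cpu", "cuda", "auto")


def resolve_device(setting: str) -> str:
    """Map a WHISPER_DEVICE setting to a concrete device name (hypothetical helper)."""
    value = setting.strip().lower()
    if value not in VALID_DEVICES:
        raise ValueError(
            f"WHISPER_DEVICE must be one of {VALID_DEVICES}, got {setting!r}"
        )
    # Assumption: "auto" means "try CUDA only"; it never resolves to "cpu".
    return "cuda" if value == "auto" else value


def load_fail_fast(setting: str, loader: Callable[[str], T]) -> T:
    """Call loader(device) once; errors such as RuntimeError are not caught."""
    device = resolve_device(setting)
    # No try/except and no retry on another device: the caller sees the failure.
    return loader(device)
```

With this shape, a missing GPU under `WHISPER_DEVICE=auto` or `WHISPER_DEVICE=cuda` surfaces as an exception at load time instead of silently running on the CPU.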
--- a/.env.example
+++ b/.env.example
@@ -38,7 +38,7 @@ WHISPER_MODEL=base
 HF_TOKEN=""


-# WHISPER_DEVICE: "cpu" | "cuda" | "auto" (auto = try GPU, fall back to CPU)
+# WHISPER_DEVICE: "cpu" | "cuda" | "auto" (auto = try CUDA, fail fast; no fallback)
 WHISPER_DEVICE=cpu