From 60c62f8f84274d80d0d83718b3e6026afa88a191 Mon Sep 17 00:00:00 2001
From: Quentin Fuxa
Date: Tue, 25 Nov 2025 23:31:46 +0100
Subject: [PATCH] troubleshooting #271 #276 #284 #286

---
 README.md               |   6 ++-
 docs/troubleshooting.md | 113 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+), 2 deletions(-)
 create mode 100644 docs/troubleshooting.md

diff --git a/README.md b/README.md
index b867199..9ae9f08 100644
--- a/README.md
+++ b/README.md
@@ -51,9 +51,11 @@ pip install whisperlivekit
 
 2. **Open your browser** and navigate to `http://localhost:8000`. Start speaking and watch your words appear in real-time!
 
-> - See [tokenizer.py](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/simul_whisper/whisper/tokenizer.py) for the list of all available languages.
-> - For HTTPS requirements, see the **Parameters** section for SSL configuration options.
+> - See [here](https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/simul_whisper/whisper/tokenizer.py) for the list of all available languages.
+> - Check the [troubleshooting guide](docs/troubleshooting.md) for step-by-step fixes collected from recent GPU setup/env issues.
 > - The CLI entry point is exposed as both `wlk` and `whisperlivekit-server`; they are equivalent.
+> - For HTTPS requirements, see the **Parameters** section for SSL configuration options.
+
 
 #### Use it to capture audio from web pages.
 
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
new file mode 100644
index 0000000..2581a41
--- /dev/null
+++ b/docs/troubleshooting.md
@@ -0,0 +1,113 @@
+# Troubleshooting
+
+
+## GPU drivers & cuDNN visibility
+
+### Linux error: `Unable to load libcudnn_ops.so* / cudnnCreateTensorDescriptor`
+> Reported in issue #271 (Arch/CachyOS)
+
+`faster-whisper` (used for the SimulStreaming encoder) dynamically loads cuDNN.
+If the runtime cannot find `libcudnn_*`, verify that CUDA and cuDNN match the PyTorch build you installed:
+
+1. **Install CUDA + cuDNN** (Arch/CachyOS example):
+   ```bash
+   sudo pacman -S cuda cudnn
+   sudo ldconfig
+   ```
+2. **Make sure the shared objects are visible**:
+   ```bash
+   ls /usr/lib/libcudnn*
+   ```
+3. **Check which CUDA version PyTorch was built against** and compare it with the toolkit you installed:
+   ```bash
+   python - <<'EOF'
+   import torch
+   print(torch.version.cuda)
+   EOF
+   nvcc --version
+   ```
+4. If you installed CUDA in a non-default location, export `CUDA_HOME` and add `$CUDA_HOME/lib64` to `LD_LIBRARY_PATH`.
+
+Once the CUDA/cuDNN versions match, `whisperlivekit-server` should start normally.
+
+### Windows error: `Could not locate cudnn_ops64_9.dll`
+> Reported in issue #286 (Conda on Windows)
+
+PyTorch bundles cuDNN DLLs inside your environment (`\Lib\site-packages\torch\lib`).
+When `ctranslate2` or `faster-whisper` cannot find `cudnn_ops64_9.dll`:
+
+1. Locate the DLL shipped with PyTorch, e.g.
+   ```
+   E:\conda\envs\WhisperLiveKit\Lib\site-packages\torch\lib\cudnn_ops64_9.dll
+   ```
+2. Add that directory to your `PATH` **or** copy the `cudnn_*64_9.dll` files into a directory that is already on `PATH` (such as the environment's `Scripts/` folder).
+3. Restart the shell before launching `wlk`.
+
+Installing NVIDIA's standalone cuDNN 9.x and pointing `PATH`/`CUDNN_PATH` to it works as well, but is usually not required.
+
+---
+
+## PyTorch / CTranslate2 GPU builds
+
+### `Torch not compiled with CUDA enabled`
+> Reported in issue #284
+
+If `torch.zeros(1).cuda()` raises that assertion, you installed a CPU-only wheel.
+Install the GPU-enabled wheels that match your CUDA toolkit:
+
+```bash
+pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130
+```
+
+Replace `cu130` with the CUDA version supported by your driver (see [PyTorch install selector](https://pytorch.org/get-started/locally/)).
+Validate with:
+
+```python
+import torch
+print(torch.cuda.is_available(), torch.cuda.get_device_name())
+```
+
+### `CTranslate2 device count: 0` or `Could not infer dtype of ctranslate2._ext.StorageView`
+> Follow-up in issue #284
+
+`ctranslate2` publishes separate CPU and CUDA wheels. The default `pip install ctranslate2` installs the CPU build, which makes WhisperLiveKit fall back to CPU tensors and leads to the dtype error above.
+
+1. Uninstall the CPU build: `pip uninstall -y ctranslate2`.
+2. Install the CUDA wheel that matches your toolkit (example for CUDA 13.0):
+   ```bash
+   pip install ctranslate2==4.5.0 -f https://opennmt.net/ctranslate2/whl/cu130
+   ```
+   (See the [CTranslate2 installation table](https://opennmt.net/CTranslate2/installation.html) for other CUDA versions.)
+3. Verify:
+   ```python
+   import ctranslate2
+   print("CUDA devices:", ctranslate2.get_cuda_device_count())
+   ```
+
+If you intentionally want CPU inference, run `wlk --backend whisper` to avoid mixing CPU-only CTranslate2 with a GPU Torch build.
+
+---
+
+## Hopper / Blackwell (`sm_121a`) systems
+> Reported in issue #276 (NVIDIA DGX Spark)
+
+GPUs with compute capability 12.1 (`sm_121a`) shipped before some toolchains recognized the architecture ID, so Triton/PTXAS need manual hints:
+
+```bash
+export CUDA_HOME="/usr/local/cuda-13.0"
+export PATH="$CUDA_HOME/bin:$PATH"
+export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
+
+# Tell Triton where the new ptxas lives
+export TRITON_PTXAS_PATH="$CUDA_HOME/bin/ptxas"
+
+# Force PyTorch to JIT kernels for all needed architectures
+export TORCH_CUDA_ARCH_LIST="8.0 9.0 10.0 12.0 12.1a"
+```
+
+After exporting those variables (or adding them to your systemd service / shell profile), restart `wlk`. Kernels targeting `sm_121a` should then compile without crashing.
+
+---
+
+Need help with another recurring issue? Open a GitHub discussion or PR and reference this document so we can keep it current.
+