Workaround: Build failure with CUDA 13.0: "Unsupported CUDA version: 13" error in onnxruntime-node #26586
erkkimon started this conversation in Show & Tell
Environment:
Problem Description:
When installing dependencies on a system with CUDA 13, `npm install` fails during the `onnxruntime-node` post-install script with the following error: `Failed to detect CUDA version from nvcc --version: Unsupported CUDA version: 13`. This occurs because onnxruntime-node doesn't yet have pre-built binaries for CUDA 13.

Workaround - Force CPU Mode Installation:
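A minimal sketch of the CPU-only install. I believe the `--onnxruntime-node-install-cuda=skip` flag (and the matching `ONNXRUNTIME_NODE_INSTALL_CUDA` environment variable) is what onnxruntime-node's install script accepts for skipping the CUDA binary download, but double-check it against the onnxruntime-node version you have pinned:

```shell
# Skip the CUDA binary download so the post-install script
# never probes `nvcc --version` for a supported CUDA release.
npm install --onnxruntime-node-install-cuda=skip

# Environment-variable form, handy for CI or Dockerfiles:
ONNXRUNTIME_NODE_INSTALL_CUDA=skip npm install
```

Either form leaves onnxruntime-node with only the CPU execution provider available.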
Then rebuild the `better-sqlite3` native bindings manually (required for database operations), e.g. with `npm rebuild better-sqlite3`.

Important Limitation:
This workaround installs onnxruntime-node in CPU-only mode, so any local model inference using `@huggingface/transformers` will run on the CPU, which may be slower for embeddings and other operations.

Alternative Solution for GPU Acceleration:
If you need GPU acceleration for embeddings but cannot downgrade from CUDA 13, I recommend using Ollama for embeddings while running vllama for LLM inference; vllama is a drop-in replacement for Ollama that runs inference on the GPU while maintaining full API compatibility.
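As a sketch of the Ollama side of this split, embeddings can be requested over Ollama's HTTP API. The port and endpoint below are Ollama's documented defaults; the model name is the embedding model used later in this post and is assumed to have been pulled already:

```shell
# Request an embedding from a locally running Ollama instance
# (assumes Ollama is listening on its default port 11434 and
# that nomic-embed-text:latest has been pulled).
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text:latest", "prompt": "Hello, world"}'
```

The response contains an `embedding` array; any Ollama-compatible server (such as vllama, per the claim above) should answer the same request.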
My Working Configuration:
I'm successfully using this setup with Perplexica with no speed penalty:

- `nomic-embed-text:latest` on CPU (embeddings)
- `huihui_ai/deepseek-r1-abliterated:14b` on GPU (LLM inference)

This separation provides optimal performance.
To configure this in Perplexica:
```
ollama pull nomic-embed-text:latest
ollama pull huihui_ai/deepseek-r1-abliterated:14b
```

I hope Perplexica will work with CUDA 13 soon, but at least here is a workaround until then!