Can I set the embedding model in CPU memory and the LLM model in GPU with Ollama? #8334
Limited by GPU memory, I have to move the embedding model to CPU memory. How can I deploy it with Ollama? Many thanks for your help!
Please check https://ragflow.io/docs/dev/deploy_local_llm @ShiShuyang
With the help of ChatGPT, this problem has been solved: define the model in a Modelfile, create it in Ollama, and then run it. This forces the model to run on CPU only, which is useful if you don't have a GPU or want to test performance in a CPU-only environment.
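The exact commands did not survive in this thread, so below is a minimal sketch of that workflow. The base model `nomic-embed-text`, the `nomic-embed-text-cpu` tag, and the sample prompt are placeholders; the key piece is the Modelfile parameter `num_gpu`, which sets how many layers Ollama offloads to the GPU, so `num_gpu 0` keeps the whole model in CPU memory:

```shell
# Write a Modelfile that pins the embedding model to the CPU.
# The base model name is a placeholder; substitute the one you use.
cat > Modelfile <<'EOF'
FROM nomic-embed-text
# num_gpu = number of layers offloaded to the GPU; 0 keeps all layers on the CPU
PARAMETER num_gpu 0
EOF

# Register the CPU-only variant with Ollama (the -cpu tag is illustrative).
ollama create nomic-embed-text-cpu -f Modelfile

# Query it through the embeddings API; every layer stays in CPU memory.
curl http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text-cpu", "prompt": "hello world"}'
```

The LLM itself needs no matching change: when a GPU is available, Ollama offloads its layers there by default, so only the embedding model ends up pinned to CPU memory.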