No option to set API timeout! #6445
Replies: 9 comments 2 replies
-
Would also really like to see this! Roo Code keeps timing out on my slow, local models.
-
Same here, please let us set the timeout. It's timing out on 80k context with a local model.
-
Agreed. Universal timeout control would be beautiful. Cline implemented one for Ollama, but neglected LM Studio, which is a bummer since Ollama doesn't support MLX. Throw a bone to those of us living life in the local slow lane ;)
-
Are you guys not doing this on purpose? It's been months. How hard is it to add a timeout setting to the GUI? This is fundamental for local LLM users.
-
I am indexing a very large codebase, which is taking hours. My successful Ollama log lines show request durations like 58.715907463s; I'm not sure I'm reading it right, but if that is how long each request takes to process, I'm near that 1-minute limit. The numbers range from 39s up to 1 min. Those "successful" runs sit between thousands of error lines, so I'm assuming this isn't a bug and it's just the timeout. Being able to set this higher would be a huge help.
-
Not for LM Studio or Ollama as a provider; there is no such setting.
…On 06/11/2025 13:49, Hannes Rudolph wrote:
I'm confused.. this was solved months ago. Look under the provider settings.
-
Try this to prove the issue is actually in RooCode, and not in LM Studio or MLX:
```sh
curl -X POST http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"***@***.***\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": $(jq -Rs . < long_file.txt)
    }]
  }"
```
Replace the model name and file name with your own model and a file long enough to take more than 300s to process.
It will work just fine, which means the issue is in RooCode, most likely in a library that has a hardcoded 300s timeout.
…On 06/11/2025 14:59, cybrah wrote:
Not solved for local models. I have to use a proxy server in between that sends blank deltas to keep the connection open.
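For reference, the proxy workaround cybrah describes above can be approximated with a small pass-through server. This is only a rough sketch under assumptions not stated in the thread (Node 18+, a streaming LM Studio endpoint on port 1234, and an SSE parser that ignores comment lines per the spec), and it uses SSE heartbeat comments rather than literal blank deltas:

```ts
// keepalive-proxy.ts — hypothetical sketch, not Roo Code's actual workaround.
import http from "node:http";

const UPSTREAM = { host: "127.0.0.1", port: 1234 }; // assumed LM Studio address

http.createServer((clientReq, clientRes) => {
  // Answer the client immediately with SSE headers so its headers timeout never fires.
  clientRes.writeHead(200, { "Content-Type": "text/event-stream" });

  // Heartbeat while the model is still crunching the prompt; SSE parsers ignore ": ..." lines.
  const heartbeat = setInterval(() => clientRes.write(": keep-alive\n\n"), 15_000);

  const upstreamReq = http.request(
    { ...UPSTREAM, path: clientReq.url, method: clientReq.method, headers: clientReq.headers },
    (upstreamRes) => {
      clearInterval(heartbeat);    // real tokens are flowing now
      upstreamRes.pipe(clientRes); // headers were already sent; just forward the stream
    }
  );

  clientReq.pipe(upstreamReq);
  upstreamReq.on("error", () => { clearInterval(heartbeat); clientRes.end(); });
  clientRes.on("close", () => { clearInterval(heartbeat); upstreamReq.destroy(); });
}).listen(8080, () => console.log("keep-alive proxy listening on http://localhost:8080"));
```

Pointing the client's base URL at `http://localhost:8080/v1` instead of the model server (again, an assumption about how such a proxy would be wired in) means the headers arrive immediately and the heartbeats keep the idle body timeout from tripping while the prompt is processed.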
-
I *located* the issue in the *undici* library used by the OpenAI SDK.
I *replaced* undici with *Axios* and there were *no more timeouts*. It is a working short-term solution.
But long term, someone needs to fix the undici lib, and the OpenAI SDK, RooCode, KiloCode, and everything else built on them need to become aware of the issue.
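To illustrate why swapping the transport sidesteps the problem, here is a minimal sketch (the model name, file name, and LM Studio port are placeholders, not values from this thread): Axios ships with `timeout: 0`, i.e. no client-side timeout at all, so the same long-running request that trips the 300s limit completes.

```ts
import axios from "axios";
import { readFileSync } from "node:fs";

async function main() {
  const res = await axios.post(
    "http://localhost:1234/v1/chat/completions", // assumed local LM Studio endpoint
    {
      model: "your-local-model", // hypothetical placeholder
      messages: [{ role: "user", content: readFileSync("long_file.txt", "utf8") }],
    },
    { timeout: 0 } // Axios' default, stated explicitly: never abort on the client side
  );
  console.log(res.data.choices[0].message.content);
}

main().catch(console.error);
```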
…On 06/11/2025 14:36, ***@***.*** wrote:
Not for LM Studio or Ollama as a provider, there is no such setting.
-
Ok, so here are my research results:
The OpenAI SDK uses native *fetch()*, and native fetch() uses Node.js's internal undici.
It looks like the Node developers hardcoded the *5 minute* timeout in undici on purpose (base timeout?).
There is also another *10 minute* hardcoded timeout in there (keepAlive timeout?).
I think this needs to be fixed in the OpenAI SDK and the other projects *by not using native fetch*.
In the meantime, the only way things worked for me was to replace it with Axios, but the developers should really look at this and find a solution, because even the cloud services will hit a 5-minute and/or 10-minute timeout at some point.
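For anyone who would rather keep the OpenAI SDK than rewrite calls with Axios, a minimal sketch of an alternative is below. The specifics are my assumptions, not confirmed by this thread: the npm `undici` package with its `headersTimeout`/`bodyTimeout` agent options (5 minutes by default), and openai-node v4's `fetch` and `timeout` client options (the SDK's own request timeout defaults to 10 minutes).

```ts
import OpenAI from "openai";
import { Agent, fetch as undiciFetch } from "undici";

// Relax undici's 5-minute defaults for slow local prompt processing.
const dispatcher = new Agent({
  headersTimeout: 60 * 60 * 1000, // ms to wait for response headers
  bodyTimeout: 60 * 60 * 1000,    // ms allowed between body chunks
});

const client = new OpenAI({
  baseURL: "http://localhost:1234/v1", // assumed LM Studio endpoint
  apiKey: "not-needed-locally",
  timeout: 60 * 60 * 1000,             // the SDK's own per-request timeout (10 min by default)
  // Route the SDK's requests through undici's fetch with the relaxed dispatcher.
  fetch: (url, init) => undiciFetch(url as any, { ...init, dispatcher } as any) as any,
});

// Example call against a hypothetical local model.
const completion = await client.chat.completions.create({
  model: "your-local-model",
  messages: [{ role: "user", content: "hello" }],
});
console.log(completion.choices[0].message.content);
```

Whether Roo Code exposes something like this as a provider setting is a separate question, but it suggests the limits are tunable at the undici layer without abandoning fetch entirely.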
-
I can configure everything else about the API, but the default API timeout is just too aggressive for large, slow local models, where processing the gigantic system prompt takes longer than the default timeout. I'm trying the new Qwen Coder 480B, which actually gets fantastic tokens per second because it's MoE, but I simply can't use it because Roo doesn't have the patience to wait for it to process its system prompt. If there's already a way to do this from the GUI, please let me know.
In the meantime, how can I hack the hard-coded timeout setting? Where does it live in the codebase? Is there a guide to forking Roo and using the fork with VS Code? Thanks.