
Conversation

ggerganov (Member) commented on Nov 7, 2025

fix #17060 #17118
cont #16391

With unified caches, restoring an old prompt from the host-memory cache is not guaranteed to succeed, because there might not be enough free space in the context memory to fit it. Handle this case gracefully by reprocessing the prompt from scratch.
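
For illustration, here is a minimal C++ sketch of that fallback logic. The helpers (`try_restore_from_host_cache`, `clear_slot_memory`) and the `slot_state` struct are hypothetical stand-ins, not the actual llama-server code; the point is only the control flow: attempt the restore, and if it cannot fit, discard the partial state and reprocess the full prompt.

```cpp
// Hypothetical sketch of the graceful fallback described above.
// These helpers are illustrative stand-ins, not the real llama-server functions.
#include <cstddef>
#include <cstdio>
#include <vector>

struct slot_state {
    std::vector<int> prompt_tokens; // tokens of the incoming prompt
    size_t           n_past = 0;    // tokens already present in the context memory
};

// stand-in: attempt to restore the saved sequence state from the host-memory
// cache; return the number of tokens restored, or 0 if the context memory
// does not have enough free space to fit it
static size_t try_restore_from_host_cache(slot_state & /*slot*/) {
    return 0; // pretend the restore failed
}

// stand-in: drop any partially restored state for this slot
static void clear_slot_memory(slot_state & slot) {
    slot.n_past = 0;
}

static void prepare_slot(slot_state & slot) {
    const size_t n_restored = try_restore_from_host_cache(slot);

    if (n_restored == 0) {
        // restore failed (e.g. not enough free space in the unified cache):
        // discard partial state and reprocess the prompt from scratch
        clear_slot_memory(slot);
        std::fprintf(stderr, "cache restore failed - reprocessing prompt\n");
    } else {
        slot.n_past = n_restored;
    }

    // continue by evaluating prompt_tokens[slot.n_past ..] as usual
}

int main() {
    slot_state slot;
    slot.prompt_tokens = {1, 2, 3};
    prepare_slot(slot);
    return 0;
}
```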

@ggerganov ggerganov marked this pull request as ready for review November 8, 2025 08:45
@ggerganov ggerganov requested a review from ngxson as a code owner November 8, 2025 08:45
@github-actions github-actions bot added the python (python script changes) label Nov 8, 2025
@ggerganov ggerganov merged commit cb1adf8 into master Nov 9, 2025
81 of 83 checks passed
@ggerganov ggerganov deleted the gg/server-cache-failures branch November 9, 2025 12:27

Labels

examples, python (python script changes), server

Successfully merging this pull request may close these issues:

Misc. bug: docker for llama server crashing with gpt-oss-20b