@krissetto
Contributor

The next release of Docker Model Runner's (DMR) inference server and CLI will add support for speculative decoding.

This PR prepares cagent to support the new feature on day one. We can hold off on merging until then.

@krissetto krissetto self-assigned this Oct 29, 2025
@krissetto krissetto requested a review from a team as a code owner October 29, 2025 16:42
@krissetto krissetto added the area/docker-model-runner For features/issues/fixes related to the usage of Docker Model Runner (DMR) label Oct 29, 2025
@dgageot dgageot marked this pull request as draft November 3, 2025 16:52
@dgageot
Member

dgageot commented Nov 3, 2025

I converted this to a draft so that it doesn't get merged yet. @krissetto, is that OK?

@krissetto
Contributor Author

@dgageot yeah, that's fine with me. Feel free to review it anyway if you want.
I'll un-draft it when the model-runner release goes out so we can get it in; it's pretty useful for speeding up local inference.

@krissetto
Contributor Author

Docker Model Runner 1.0 has been released.

@krissetto krissetto marked this pull request as ready for review November 6, 2025 16:32
@krissetto krissetto force-pushed the speculative-decoding-dmr branch 3 times, most recently from d02ec5e to 6bf923c Compare November 10, 2025 17:51
@krissetto krissetto force-pushed the speculative-decoding-dmr branch from 6bf923c to 5e9ef60 Compare November 10, 2025 17:58
@dgageot dgageot merged commit 7bf2219 into docker:main Nov 11, 2025
5 checks passed