@cicoyle
Contributor

@cicoyle cicoyle commented Nov 10, 2025

Proposal to improve Actors and Workflows Reliability and Performance by consolidating Placement -> Scheduler

Signed-off-by: Cassandra Coyle <[email protected]>
@WhitWaldo
Contributor

This seems like a hands-down win across the board. Consolidating the older Placement table into the newer Scheduler looks like it would significantly reduce network I/O, CPU, and memory usage, while benefiting from the performance improvements made to Scheduler over the last several releases.

@yaron2
Member

yaron2 commented Nov 18, 2025

Eligible voters: @dapr/maintainers-dapr

- After that, subsequent changes are per-type: only the affected types must pause, update, and resume.

Sidecar Startup vs steady-state
- On new stream: Scheduler sends LOCK(all) → UPDATE(full snapshot: all types, versions per type) → UNLOCK(all).
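The LOCK → UPDATE → UNLOCK sequence quoted above can be sketched as a small state machine on the sidecar side. This is only an illustration under stated assumptions: `MsgKind`, `Msg`, `Sidecar`, and the convention that an empty `Types` slice means "all types" are hypothetical names invented here, not the Dapr or Scheduler API, and the real wire format is not shown in this thread.

```go
package main

// MsgKind is a hypothetical message kind on the Scheduler-to-sidecar stream.
type MsgKind int

const (
	Lock MsgKind = iota
	Update
	Unlock
)

// Msg is a hypothetical stream message.
// Assumption: an empty Types slice stands for "all types".
type Msg struct {
	Kind     MsgKind
	Types    []string
	Versions map[string]uint64 // per-type table versions carried by UPDATE
}

// Sidecar tracks which actor types are paused and the table version per type.
type Sidecar struct {
	lockAll  bool
	locked   map[string]bool
	versions map[string]uint64
}

// NewSidecar returns a sidecar with no locks held and no versions known.
func NewSidecar() *Sidecar {
	return &Sidecar{locked: map[string]bool{}, versions: map[string]uint64{}}
}

// Paused reports whether actors of type t are currently paused.
func (s *Sidecar) Paused(t string) bool { return s.lockAll || s.locked[t] }

// Version returns the last table version applied for type t (0 if none).
func (s *Sidecar) Version(t string) uint64 { return s.versions[t] }

// Apply processes one message from the stream.
func (s *Sidecar) Apply(m Msg) {
	switch m.Kind {
	case Lock:
		if len(m.Types) == 0 {
			s.lockAll = true // startup: LOCK(all)
			return
		}
		for _, t := range m.Types {
			s.locked[t] = true // steady state: only the affected types pause
		}
	case Update:
		for t, v := range m.Versions {
			if s.Paused(t) { // ignore updates for types that were not locked
				s.versions[t] = v
			}
		}
	case Unlock:
		if len(m.Types) == 0 {
			s.lockAll = false
			s.locked = map[string]bool{}
			return
		}
		for _, t := range m.Types {
			delete(s.locked, t)
		}
	}
}
```

In this sketch, a new stream would apply `Lock` with no types, an `Update` carrying versions for every type, then `Unlock` with no types; later changes would name only the affected types, so every other type stays unpaused throughout.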


Do you think this will raise the startup time for daprd instances hosting actors? Or not, since this was already happening in Placement v1?

Contributor Author


The latter: no, startup time won’t increase, because the full snapshot on new streams was already happening with the original Placement service.

We should actually see some improvement here, because sidecars no longer need to talk to two separate control plane services for actors and their reminders. With this proposal, everything funnels through the Scheduler service, and fewer hops and connections should reduce overhead.

This enforces soft stickiness: per‑type updates do not stop actors that still map to the local sidecar, which avoids
unnecessary churn and short global pauses.
So, during LOCK([T]) → UPDATE([T]) → UNLOCK([T]), only actors of types [T] that moved to a remote owner are drained and
all others continue running. No namespace‑wide drain.
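The soft-stickiness rule described above can be sketched as a filter over the locally active actors. This is an illustrative sketch only: `ActorID`, `ToDrain`, the `owner` lookup function, and the sidecar names in the usage note are hypothetical and are not taken from the proposal or the Dapr codebase.

```go
package main

// ActorID identifies an actor by type and ID (hypothetical shape).
type ActorID struct{ Type, ID string }

// ToDrain returns the locally active actors that must be drained after a
// per-type update. An actor is drained only if its type was part of the
// update AND the new table maps it to an owner other than this sidecar.
// Assumption: owner resolves an actor to its owner under the updated table.
func ToDrain(active []ActorID, updated map[string]bool, owner func(ActorID) string, self string) []ActorID {
	var out []ActorID
	for _, a := range active {
		if !updated[a.Type] {
			continue // type not in LOCK([T]): keeps running untouched
		}
		if owner(a) != self {
			out = append(out, a) // moved to a remote owner: drain
		}
		// Still mapped to this sidecar: keeps running (soft stickiness).
	}
	return out
}
```

For example, if a sidecar named `sidecar-a` hosts `Cart/1`, `Cart/2`, and `User/9`, and an update for type `Cart` moves only `Cart/2` elsewhere, this function returns just `Cart/2`; `Cart/1` and `User/9` are left alone.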


This is a huge win 🎉

@JoshVanL
Contributor

JoshVanL commented Dec 2, 2025

+1 binding

@yaron2
Member

yaron2 commented Dec 2, 2025

+1 binding

@yaron2 yaron2 merged commit 81225a9 into dapr:main Dec 2, 2025
1 check passed
5 participants