@levzem
Contributor

@levzem levzem commented Oct 24, 2025

Closes #16054

This PR adds a new LoopService called DBVacuum that automatically cleans up Prefect resources older than a configurable retention period. This helps keep the Prefect DB from filling up with old flow runs, task runs, etc.

The DBVacuum deletes the following:

  1. flow runs
  2. associated logs
  3. associated artifacts

Task runs are handled by the DELETE CASCADE rules set up on the flow run table.

The DBVacuum queries a limited number of parent flow runs (those that do not have a parent_task_run_id) that are older than the retention period and deletes them. It then deletes logs whose flow runs no longer exist, and finally does the same for artifacts.

Deleting a parent flow run deletes its child task runs and sets the parent_task_run_id of its child flow runs to null, so those child flow runs get picked up on the next round of the loop. This guarantees that all flow runs and their sub flow runs will eventually be deleted (along with the associated logs and artifacts).
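
To illustrate the loop described above, here is a minimal, self-contained sketch using the standard-library sqlite3 module. It is not the code in this PR: the schema, the table and column names other than parent_task_run_id, the use of end_time as the age criterion, and the helpers build_demo_db and vacuum_once are all assumptions made for the example.

```python
import sqlite3

# Hypothetical, simplified schema. Only parent_task_run_id comes from the PR
# description; every other name here is an assumption for illustration.
SCHEMA = """
CREATE TABLE flow_run (
    id INTEGER PRIMARY KEY,
    end_time TEXT NOT NULL,
    parent_task_run_id INTEGER REFERENCES task_run(id) ON DELETE SET NULL
);
CREATE TABLE task_run (
    id INTEGER PRIMARY KEY,
    flow_run_id INTEGER NOT NULL REFERENCES flow_run(id) ON DELETE CASCADE
);
CREATE TABLE log (id INTEGER PRIMARY KEY, flow_run_id INTEGER);
CREATE TABLE artifact (id INTEGER PRIMARY KEY, flow_run_id INTEGER);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory DB with one old parent run, one old subflow run,
    and one recent run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for FK actions
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO flow_run VALUES (1, '2024-01-01T00:00:00', NULL)")
    conn.execute("INSERT INTO task_run VALUES (10, 1)")
    # Flow run 2 is a subflow started by task run 10 of flow run 1.
    conn.execute("INSERT INTO flow_run VALUES (2, '2024-01-02T00:00:00', 10)")
    conn.execute("INSERT INTO task_run VALUES (20, 2)")
    conn.execute("INSERT INTO flow_run VALUES (3, '2025-06-01T00:00:00', NULL)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
    conn.executemany("INSERT INTO artifact VALUES (?, ?)", [(1, 1), (2, 3)])
    conn.commit()
    return conn


def vacuum_once(conn: sqlite3.Connection, cutoff: str, batch_size: int) -> int:
    """Run one round of the loop and return how many flow runs were deleted."""
    # 1. Pick a bounded batch of expired flow runs that have no parent task run.
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM flow_run "
            "WHERE parent_task_run_id IS NULL AND end_time < ? "
            "ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        )
    ]
    # 2. Delete them. Task runs go via ON DELETE CASCADE, and subflow runs
    #    get parent_task_run_id = NULL via ON DELETE SET NULL.
    conn.executemany("DELETE FROM flow_run WHERE id = ?", [(i,) for i in ids])
    # 3. Delete logs, then artifacts, whose flow run no longer exists.
    conn.execute(
        "DELETE FROM log WHERE flow_run_id NOT IN (SELECT id FROM flow_run)"
    )
    conn.execute(
        "DELETE FROM artifact WHERE flow_run_id NOT IN (SELECT id FROM flow_run)"
    )
    conn.commit()
    return len(ids)


if __name__ == "__main__":
    db = build_demo_db()
    cutoff = "2025-01-01T00:00:00"
    # Keep looping until a round finds nothing left to delete.
    while vacuum_once(db, cutoff, batch_size=10):
        pass
    print(db.execute("SELECT id FROM flow_run").fetchall())
```

In this sketch the first round can only delete flow run 1, because flow run 2 still has a parent_task_run_id. Once the cascade removes task run 10 and nulls that reference, the second round picks up flow run 2, which is the eventual-deletion behavior described above.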

Checklist

  • This pull request references any related issue by including "closes <link to issue>"
    • If no issue exists and your change is not a small fix, please create an issue first.
  • If this pull request adds new functionality, it includes unit tests that cover the changes
  • If this pull request removes docs files, it includes redirect settings in mint.json.
  • If this pull request adds functions or classes, it includes helpful docstrings.

@github-actions github-actions bot added the performance label (Related to an optimization or performance improvement) Oct 24, 2025
@codspeed-hq

codspeed-hq bot commented Oct 24, 2025

CodSpeed Performance Report

Merging #19280 will not alter performance

Comparing levzem:lev/vacuum (7225337) with main (7fcfb7e)

Summary

✅ 2 untouched

@levzem
Contributor Author

levzem commented Oct 25, 2025

@stephane-additive fyi

@zzstoatzz
Collaborator

hi @levzem - i appreciate the PR and have a couple initial thoughts:

  • something like this should exist
  • we should not introduce it as default on, as that would cause significant changes in behavior for server operators
  • we should probably get aligned in a github issue (that proposes a design, not just the original motivation of cleaning up the DB) so as to reduce back and forth on the PR

can you create an issue where you lay out your design here and why? this would be a non-trivial introduction to the server

@levzem
Contributor Author

levzem commented Oct 27, 2025

@zzstoatzz thanks for the quick turnaround.

I agree it should be off by default; that was a copy pasta mistake on my end.

I have tagged you in a design proposal in #16054. Look forward to hearing your thoughts.

@zzstoatzz zzstoatzz marked this pull request as draft October 27, 2025 18:04
@zzstoatzz
Collaborator

great thank you @levzem - marking this as a draft for the time being! will give feedback on the issue

Development

Successfully merging this pull request may close these issues.

Auto clean-up feature for the Prefect internal database

2 participants