[RFC] virtio-balloon: Add free page reporting hinting #5491
Force-pushed from c96f7be to 8ef7916
Codecov Report ❌
@@            Coverage Diff             @@
##             main    #5491      +/-   ##
==========================================
- Coverage   82.75%   82.74%   -0.01%
==========================================
  Files         269      269
  Lines       27798    28126     +328
==========================================
+ Hits        23003    23274     +271
- Misses       4795     4852      +57
Free page reporting is a mechanism in which the guest notifies the host of pages that are not currently in use. This feature can only be configured at boot; once enabled, the guest reports free pages continuously. With free page reporting, Firecracker calls `madvise` with `MADV_DONTNEED` on the reported ranges. This allows the host to free up memory and reduces the RSS of the VM. With UFFD, the removal is delivered as a `UFFD_EVENT_REMOVE` event after the `MADV_DONTNEED` call. Signed-off-by: Jack Thomson <[email protected]>
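The effect of `MADV_DONTNEED` on a reported range can be seen in isolation with a small sketch. It uses Python's standard `mmap` module on Linux rather than Firecracker's own code; the helper name `discard_range` and the two-page mapping are assumptions made purely for illustration.

```python
import mmap

PAGE = mmap.PAGESIZE


def discard_range(mem: mmap.mmap, offset: int, length: int) -> None:
    """Tell the kernel that [offset, offset + length) is no longer needed.

    Hypothetical helper: it stands in for what a VMM might do with a range
    the guest has reported as free.
    """
    mem.madvise(mmap.MADV_DONTNEED, offset, length)


# Private anonymous mapping standing in for guest memory (Linux only).
mem = mmap.mmap(-1, 2 * PAGE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
mem[:] = b"\xaa" * (2 * PAGE)  # touch both pages so they become resident

discard_range(mem, 0, PAGE)    # "report" the first page as free

first_page = mem[:PAGE]        # faults back in as a zero-filled page
second_page = mem[PAGE:]       # untouched, still holds the old data
```

The mapping is deliberately private and anonymous: for shared mappings, `MADV_DONTNEED` does not discard the contents in the same way, so the zero-fill behaviour shown here would not apply.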
Free page hinting is a mechanism which allows the guest driver to report ranges of unused pages to the host device. The device triggers a "hinting" run by writing a new command id to the config space; after the id is updated, the driver hints unused ranges to the host. Once the driver has exhausted all free ranges, it notifies the device that the run has completed. The device can then issue another command allowing the guest to reclaim these pages. To support hinting in the Firecracker balloon device, we offer three operations to manage it: one to start a run, one to monitor its status, and one to issue the command that allows the guest to reclaim pages. Note that there is a potential race condition in the Linux driver which would allow a range to be reclaimed in an OOM scenario before we remove the range. Signed-off-by: Jack Thomson <[email protected]>
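The command-id handshake described above can be modelled as a small state machine. This is a toy sketch, not Firecracker's implementation: the class `HintingModel` and its method names are hypothetical, and only the two reserved ids (`STOP` = 0, `DONE` = 1) follow the values in the Linux virtio-balloon UAPI header.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Reserved command ids (values as in the Linux virtio-balloon UAPI header).
CMD_ID_STOP = 0   # marks the end of a hinting run
CMD_ID_DONE = 1   # the guest may reclaim the pages it hinted
FIRST_RUN_ID = 2  # assumption: run ids are allocated above the reserved ones


@dataclass
class HintingModel:
    """Toy model of the device side of a hinting run."""

    host_cmd_id: int = CMD_ID_DONE        # value exposed in the config space
    guest_cmd_id: Optional[int] = None    # last id received from the driver
    next_run_id: int = FIRST_RUN_ID
    hinted: List[Tuple[int, int]] = field(default_factory=list)

    def start(self) -> int:
        """Start a run by publishing a fresh command id."""
        self.host_cmd_id = self.next_run_id
        self.next_run_id += 1
        self.guest_cmd_id = None
        self.hinted.clear()
        return self.host_cmd_id

    def driver_ack(self, cmd_id: int) -> None:
        """The driver echoes the run id before hinting, or STOP when finished."""
        self.guest_cmd_id = cmd_id

    def driver_hint(self, addr: int, length: int) -> bool:
        """Accept a hinted range only if it belongs to the current run."""
        if self.host_cmd_id < FIRST_RUN_ID or self.guest_cmd_id != self.host_cmd_id:
            return False
        self.hinted.append((addr, length))
        return True

    def run_finished(self) -> bool:
        """Status check: has the driver signalled the end of the current run?"""
        return self.host_cmd_id >= FIRST_RUN_ID and self.guest_cmd_id == CMD_ID_STOP

    def stop(self) -> None:
        """Publish DONE so the guest can reclaim the hinted pages."""
        self.host_cmd_id = CMD_ID_DONE


dev = HintingModel()
run_id = dev.start()                               # start a run
dev.driver_ack(run_id)                             # driver picks up the new id
accepted = dev.driver_hint(0x4000_0000, 4 << 20)   # one 4 MiB range
dev.driver_ack(CMD_ID_STOP)                        # driver ran out of free ranges
finished = dev.run_finished()                      # monitor the status
dev.stop()                                         # let the guest reclaim
```

Tagging hints with the run id lets the device discard stale hints from an earlier run, which is why `driver_hint` compares the two ids before accepting a range.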
Add API endpoints to manage free page hinting. There are three endpoints: Start, to begin a new free page hinting run; Status, to track the state of the hinting run; and Stop, to stop the hinting run and allow the guest to reclaim the reported pages. Signed-off-by: Jack Thomson <[email protected]>
Add metrics to track free page hinting and reporting. For both features, track the number of ranges reported, the number of errors encountered while freeing, and the total amount of memory freed. Signed-off-by: Jack Thomson <[email protected]>
Force-pushed from 8ef7916 to 5a1b473
Add new resources to the HTTP API to enable testing of the hinting functionality. Signed-off-by: Jack Thomson <[email protected]>
Add integration tests for free page hinting and reporting, both functional and performance tests. Update fast_page_helper so it can run in a oneshot mode that does not require the signal to track performance. Add new functional tests to ensure that hinting and reporting reduce the RSS of the guest as expected. Update the reduce RSS test to touch memory, reducing the chance of flakiness. Add new performance tests for the balloon device: the first tracks the CPU overhead of hinting and reporting, and the second measures the faulting latency while reporting is running in the guest. Signed-off-by: Jack Thomson <[email protected]>
Add integration tests for free page hinting and reporting, asserting that the features are enabled correctly and that the config space updates triggered by hinting are set as expected. Signed-off-by: Jack Thomson <[email protected]>
While the traditional balloon device cannot reclaim memory when backed by huge pages, it can still technically be used to restrict memory usage in the guest. Hinting and reporting, however, report ranges in bigger sizes (4 MiB by default), which makes it possible for the host to reclaim the huge pages backing the guest. Update the performance tests for the balloon when backed by huge pages, and add variants to the size reduction tests to ensure hinting and reporting can reduce the RSS of the guest. Move the inflation test to performance to ensure it runs sequentially in CI; otherwise the host can be exhausted of huge pages. Signed-off-by: Jack Thomson <[email protected]>
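The size argument can be made concrete with a little arithmetic. The sketch below assumes 2 MiB huge pages and 4 KiB base pages (common on x86_64, but an assumption here) together with the 4 MiB default range size quoted in the commit message; the function name is made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB

BASE_PAGE = 4 * KIB     # assumption: base page size
HUGE_PAGE = 2 * MIB     # assumption: huge page size backing the guest
REPORT_RANGE = 4 * MIB  # default range size quoted in the commit message


def whole_pages_covered(range_len: int, page_size: int) -> int:
    """Whole pages of `page_size` contained in an aligned range of `range_len` bytes."""
    return range_len // page_size


# A single 4 KiB page never spans a whole huge page.
huge_pages_per_base_page = whole_pages_covered(BASE_PAGE, HUGE_PAGE)

# A 4 MiB range spans whole huge pages that the host can hand back.
huge_pages_per_range = whole_pages_covered(REPORT_RANGE, HUGE_PAGE)
base_pages_per_range = whole_pages_covered(REPORT_RANGE, BASE_PAGE)
```

Under these assumptions a reported range covers two whole huge pages, whereas a single base page covers none, which matches the commit message's point that only the larger ranges let the host reclaim huge-page-backed memory.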
Force-pushed from 5a1b473 to 8dcf4fd
    // The feature bitmap for virtio balloon.
    const VIRTIO_BALLOON_F_STATS_VQ: u32 = 1; // Enable statistics.
    const VIRTIO_BALLOON_F_DEFLATE_ON_OOM: u32 = 2; // Deflate balloon on OOM.
    const VIRTIO_BALLOON_F_FREE_PAGE_REPORTING: u32 = 5; // Enable free page reportin
g
    EventFd::new(libc::EFD_NONBLOCK).map_err(BalloonError::EventFd)?,
    EventFd::new(libc::EFD_NONBLOCK).map_err(BalloonError::EventFd)?,
    EventFd::new(libc::EFD_NONBLOCK).map_err(BalloonError::EventFd)?,
    EventFd::new(libc::EFD_NONBLOCK).map_err(BalloonError::EventFd)?,
nit: at this stage a loop would look sensible
    parameters:
      - name: body
        in: body
        description: When the device completes the hinting whether we shoud automatically ack this.
s/shoud/should/
    #include <sys/mman.h> // mmap
    #include <time.h> // clock_gettime
    #include <fcntl.h> // open
    #include <getopt.h> // getopt
nit: can extract the update to the helper into a commit and explain the changes
Makes sense, I will do!
    time.sleep(1)

    # Get the firecracker pid, and open an ssh connection.
is the bit about the ssh connection relevant here?
I'll drop that, good catch.
        time.sleep(1)
        microvm.api.balloon_hinting_start.patch()
    elif method == "reporting":
        time.sleep(2)
any reason why reporting requires a longer delay than hinting?
Reporting is expected to start within ~2 seconds, and hinting takes ~200 ms in my testing, which is why I picked these values. I can add a comment, as they do look like magic numbers.
    # Wait for the deflate to complete.
    _ = get_stable_rss_mem_by_pid(firecracker_pid)
    if method == "none":
is this not a "traditional" device?
Good catch, thanks.
    }

    #[test]
    fn test_process_hinting() {
nit: consider splitting this test into multiple self-contained ones, each testing its own scenario.
Will do thanks!
            with attempt:
                return int(self.jailer.pid_file.read_text(encoding="ascii"))

    @cached_property
nit: can go to a separate commit as test refactoring
This is pulled from the mem hot-plugging PR so will be able to drop it once that lands :)
    time.sleep(sleep_duration)


    # pylint: disable=C0103
nit: moving the test before making changes can go to a separate commit.
Will do thanks!
Description
Adding support for virtio-balloon features: Free page hinting and reporting.
TODO
Update documentation on the update balloon features
Update release notes
...
Reason
...
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.
PR Checklist
`tools/devtool checkbuild --all` to verify that the PR passes build checks on all supported architectures.
`tools/devtool checkstyle` to verify that the PR passes the automated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
`CHANGELOG.md`.
Runbook for Firecracker API changes.
integration tests.
`TODO`.
`rust-vmm`.