Commit 18004a8
feat: add seed offset args to sampler to allow cuda graph support (#2132)
<!-- .github/pull_request_template.md -->
## 📌 Description
This PR adds optional seed/offset args to all the sampler functions so
that callers can avoid the internal `get_seed_and_offset` call. If that
function is not called, the sampler forward call can potentially be
captured as part of a CUDA graph and replayed.
The seed/offset values can be computed directly before launching the
graph, in the same way the current method computes them, and passed in
when making the FlashInfer call.
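
A minimal sketch of the idea, in plain Python so it runs without a GPU. `HostRngState`, `next_seed_and_offset`, and `sample_from_probs` are hypothetical names used only for illustration; they are not FlashInfer's API, and the real samplers may derive and consume the seed/offset differently. The point is only that the host hands out `(seed, offset)` ahead of time, and the sampler's result then depends solely on its explicit inputs.

```python
import random
from dataclasses import dataclass


@dataclass
class HostRngState:
    """Hypothetical host-side counter (an assumption, not FlashInfer's API)."""

    seed: int
    offset: int = 0

    def next_seed_and_offset(self, increment: int):
        # Hand out the current (seed, offset), then advance the offset so the
        # next call draws from a fresh part of the random stream.
        current = (self.seed, self.offset)
        self.offset += increment
        return current


def sample_from_probs(probs, seed, offset):
    """Stand-in sampler: the result depends only on (probs, seed, offset)."""
    # Mixing seed and offset into one integer is an illustrative choice only.
    rng = random.Random(seed * 1_000_003 + offset)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


state = HostRngState(seed=1234)
probs = [0.1, 0.2, 0.3, 0.4]

# Compute the values on the host, outside the region that would be captured.
seed, offset = state.next_seed_and_offset(increment=4)

first = sample_from_probs(probs, seed, offset)
replay = sample_from_probs(probs, seed, offset)  # same inputs, same sample

# A later step gets a new offset and therefore an independent draw.
next_seed, next_offset = state.next_seed_and_offset(increment=4)
```

Because the sampler no longer reads or mutates generator state itself, the call has no hidden host-side work, which is what could make it a candidate for graph capture.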
## 🔍 Related Issues
#978: `top_k_top_p_sampling_from_logits` incompatible with torch.compile
+ CUDAGraph
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [x] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [x] I have installed the hooks with `pre-commit install`.
- [x] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [x] Tests have been added or updated as needed.
- [x] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
  * Optional seed and offset parameters added to sampling APIs to enable
    deterministic RNG control while remaining optional.
* **Tests**
  * New tests verify reproducible sampling when using the same seed/offset
    and variability when different values are used.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

1 parent d0d99d2 commit 18004a8
2 files changed, +189 -14 lines changed