Fix mixed torch.Tensor and DTensor in generate when using FSDP2 + LoRA #42436
What does this PR do?
`model.generate` raises a "mixed torch.Tensor and DTensor" error when FSDP2 and LoRA are used together. The `Trainer` class is the base class for TRL trainers such as `PPOTrainer` and `GRPOTrainer`, in which `model.generate` is called during the training loop. With FSDP2, the all-gather and reshard of parameters are handled automatically for the `forward` method but not for `generate`, which leads to the error above.

To fix it, PyTorch provides `register_fsdp_forward_method`, which lets FSDP2 manage the DTensors for additional methods. By registering `generate`, we get rid of the error; a minimal sketch of the mechanism is shown below. In short, `generate` is not properly supported under FSDP2 acceleration, and this PR fixes that.
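
A minimal standalone sketch of the mechanism (not the exact PR diff): the model name, launch setup, and generation arguments are illustrative, and a recent PyTorch is assumed (`register_fsdp_forward_method` was added in 2.4, and importing `fully_shard` from `torch.distributed.fsdp` requires a recent release).

```python
import torch
import torch.distributed as dist
from torch.distributed.fsdp import fully_shard, register_fsdp_forward_method
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes launch via `torchrun` so rank/world-size env vars are set.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = AutoModelForCausalLM.from_pretrained("gpt2").cuda()  # illustrative model
fully_shard(model)  # FSDP2: parameters become sharded DTensors

# forward() is already wrapped by FSDP2 (all-gather before, reshard after),
# but generate() is not, so it would mix sharded DTensor weights with plain
# tensors and raise the error this PR addresses. Registering generate makes
# FSDP2 treat it like forward().
register_fsdp_forward_method(model, "generate")

tokenizer = AutoTokenizer.from_pretrained("gpt2")
inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=8)

dist.destroy_process_group()
```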
Fixes #42417
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.