@ShivanshTiwari1

What does this PR do?

This PR fixes a minor docstring inconsistency in the MixtralSparseMoeBlock.forward method in src/transformers/models/mixtral/modular_mixtral.py.

The function signature uses the parameters top_k_index and top_k_weights, but the docstring referred to them as selected_experts and routing_weights.

This update makes the docstring match the signature, which matters for code clarity and for any models that inherit this MoE block.
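The kind of mismatch described above can be sketched as follows. This is a minimal illustration, not the actual transformers code: `forward_stale` and `forward_fixed` are hypothetical stubs that only reuse the parameter names mentioned in this PR, and `docstring_mismatches` is an assumed helper that compares the names listed under `Args:` in a docstring with the function signature.

```python
import inspect
import re


def forward_stale(hidden_states, top_k_index, top_k_weights):
    """Hypothetical stub whose docstring still uses the old names.

    Args:
        hidden_states: input activations.
        selected_experts: indices of the experts chosen per token.
        routing_weights: weights assigned to the chosen experts.
    """
    return hidden_states


def forward_fixed(hidden_states, top_k_index, top_k_weights):
    """Hypothetical stub whose docstring matches the signature.

    Args:
        hidden_states: input activations.
        top_k_index: indices of the experts chosen per token.
        top_k_weights: weights assigned to the chosen experts.
    """
    return hidden_states


def documented_params(func):
    """Collect names listed under 'Args:' in a Google-style docstring.

    Simplification: any indented 'name:' line inside the Args block counts,
    so wrapped descriptions that contain 'word:' could be picked up too.
    """
    doc = inspect.getdoc(func) or ""
    names = []
    in_args = False
    for line in doc.splitlines():
        if line.strip() == "Args:":
            in_args = True
            continue
        if in_args:
            match = re.match(r"\s+(\w+)\s*(\(.*?\))?:", line)
            if match:
                names.append(match.group(1))
            elif line.strip() and not line.startswith(" "):
                in_args = False  # an unindented line ends the Args block
    return names


def docstring_mismatches(func):
    """Return (missing, stale): undocumented parameters and documented non-parameters."""
    sig_names = [p for p in inspect.signature(func).parameters if p != "self"]
    doc_names = documented_params(func)
    missing = [p for p in sig_names if p not in doc_names]
    stale = [p for p in doc_names if p not in sig_names]
    return missing, stale


print(docstring_mismatches(forward_stale))
print(docstring_mismatches(forward_fixed))
```

For the stale stub, the helper reports top_k_index and top_k_weights as missing from the docstring and selected_experts and routing_weights as stale; for the fixed stub, both lists are empty.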


Fixes #41984

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. (The linked issue shows approval.)
  • Did you make sure to update the documentation with your changes? (The change is the documentation update itself.)
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker (As tagged in the original issue thread)

@ShivanshTiwari1 force-pushed the fix/mixtral-docstring-naming branch 2 times, most recently from 5cb1cb4 to 70b756c on November 4, 2025 09:43
@Rocketknight1
Member

Hi @ShivanshTiwari1, the fixes look okay, but there are a lot of unrelated style changes in the code! Try to revert those, or do pip install -e .[quality] to get the style tools yourself and run make fixup or make style to ensure the code matches our formatting rules.

@ShivanshTiwari1 force-pushed the fix/mixtral-docstring-naming branch from 70b756c to 8b06044 on November 4, 2025 13:25
@github-actions
Contributor

github-actions bot commented Nov 4, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: mixtral

@diegoakel mentioned this pull request Nov 5, 2025
Development

Successfully merging this pull request may close these issues.

Variable name mismatch on MixtralExperts.forward()
