
Conversation

@gramalingam gramalingam commented Nov 5, 2025

Implement SDPA via MHA. This handles the case where the earlier fusion rules do not map larger patterns containing SDPA into MHA, GQA, or Attention (from the ORT contrib ops); any remaining standalone SDPA op is rewritten directly as MHA.
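For intuition, the key step in such a rewrite is a pure layout transformation: SDPA in this codebase operates on 4D tensors ([batch, num_heads, seq_len, head_dim]), while the ORT MultiHeadAttention contrib op consumes 3D tensors ([batch, seq_len, num_heads * head_dim]). The numpy sketch below (illustrative, not the PR's actual implementation; `sdpa` here is a plain reference softmax-attention, and plain reshapes stand in for the MHA op) shows that round-tripping through the 3D layout preserves the result:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sdpa(q, k, v):
    # Reference scaled-dot-product attention on the 4D "SDPA" layout:
    # q, k, v: [batch, num_heads, seq_len, head_dim]
    scale = 1.0 / np.sqrt(q.shape[-1])
    attn = softmax(q @ k.transpose(0, 1, 3, 2) * scale)
    return attn @ v  # [batch, num_heads, seq_len, head_dim]

def sdpa_via_mha_layout(q, k, v):
    # The fusion transposes/reshapes the 4D SDPA inputs into MHA's
    # 3D [batch, seq_len, num_heads * head_dim] layout, runs attention,
    # and reshapes the output back. sdpa() stands in for the MHA op here.
    b, h, s, d = q.shape
    to3d = lambda t: t.transpose(0, 2, 1, 3).reshape(b, s, h * d)
    to4d = lambda t: t.reshape(b, s, h, d).transpose(0, 2, 1, 3)
    q3, k3, v3 = map(to3d, (q, k, v))
    out3 = to3d(sdpa(to4d(q3), to4d(k3), to4d(v3)))
    return to4d(out3)  # back to [batch, num_heads, seq_len, head_dim]
```

Since the transposes and reshapes are exact inverses, both paths produce identical results; the actual fusion only has to get these layout conversions right around the contrib op.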

Signed-off-by: Ganesan Ramalingam <[email protected]>
codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 60.60606% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.12%. Comparing base (d80575d) to head (21a1594).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Files with missing lines                          Patch %   Lines
onnxscript/rewriter/ort_fusions/sdpa_test.py       42.10%   11 Missing ⚠️
onnxscript/rewriter/ort_fusions/sdpa_via_mha.py    83.33%   1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2683      +/-   ##
==========================================
+ Coverage   70.11%   70.12%   +0.01%     
==========================================
  Files         224      225       +1     
  Lines       26982    27063      +81     
  Branches     2705     2719      +14     
==========================================
+ Hits        18919    18979      +60     
- Misses       7129     7146      +17     
- Partials      934      938       +4     


Signed-off-by: Ganesan Ramalingam <[email protected]>
@gramalingam gramalingam changed the title [DRAFT] Implement SDPA via MHA Implement SDPA via MHA Nov 5, 2025
@gramalingam gramalingam marked this pull request as ready for review November 5, 2025 22:31

class SDPAImplementation(pattern.RewriteRuleClassBase):
-    def pattern(self, op, query, key, value):
+    def pattern(self, op, query, key, value, key_format):
Collaborator

Create a docstring for these params?

Collaborator Author

Thinking ... [ok, I picked that from ChatGPT :-)] ... I suspect what you are asking for is a "spec" for SDPA (the internal op used by fusion, which does not correspond to any ONNX standard op or ORT contrib op, though it is a close approximation of ONNX's Attention). Is that right? Since that op is used across multiple fusions and files, the sdpa.py file might be the right place for it. A one-liner docstring for these params would only say that they correspond to the inputs and attribute of the SDPA op, which adds nothing beyond what the reader can see in the next few lines.

Collaborator Author

Ok, added documentation as discussed above.
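For reference, a docstring along these lines might look like the following sketch. The parameter descriptions, shapes, and `key_format` values shown are illustrative assumptions for this example, not text taken from the PR:

```python
def pattern(self, op, query, key, value, key_format):
    """Match a standalone SDPA op so it can be rewritten as MHA.

    The parameters mirror the inputs/attribute of the internal SDPA op
    used by the fusion rules (shapes below are illustrative, not quoted
    from the PR):
        op: the pattern-builder handle supplied by the rewriter.
        query: query tensor, e.g. [batch, num_heads, seq_len, head_dim].
        key: key tensor; its layout is described by ``key_format``.
        value: value tensor, same layout family as ``query``.
        key_format: attribute indicating whether ``key`` is stored
            transposed (e.g. with head_dim before seq_len) or not.
    """
    ...
```

A short pointer from each parameter back to the SDPA op spec (kept in sdpa.py, as discussed above) keeps the per-rule docstrings from duplicating the full specification.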

Signed-off-by: Ganesan Ramalingam <[email protected]>


3 participants