@Jiawei-Shao
Contributor

Description

This patch implements the Split-K optimization for GEMM.

  1. Handle GEMM in MatMulFillBiasOrZeroBeforeSplitKProgram.
    This requires adding beta as a new uniform value, along with the parameters that MatMulWriteFnSource() needs to handle all GEMM cases.
  2. Support Split-K in GemmProgram::GenerateShaderCode().
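
For readers unfamiliar with Split-K, the idea behind the two steps above can be sketched on the CPU. This is only an illustrative reference, not the shader code from this patch: the function name SplitKGemm, its parameters, and the row-major layout are all assumptions made for the example. It first fills the output with beta * C (or zero when there is no C), then lets each slice of the K dimension accumulate its partial product into that output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for Split-K GEMM (not the WebGPU shader):
//   Y = alpha * A(MxK) * B(KxN) + beta * C(MxN), all matrices row-major.
// `c` may be null, in which case the output starts from zero.
std::vector<float> SplitKGemm(const std::vector<float>& a,
                              const std::vector<float>& b,
                              const std::vector<float>* c,
                              std::size_t m, std::size_t n, std::size_t k,
                              float alpha, float beta, std::size_t splits) {
  assert(splits > 0);
  assert(a.size() == m * k && b.size() == k * n);

  // Pass 1: initialize the output with beta * C, or zero if C is absent.
  // The output must be initialized before any split accumulates into it.
  std::vector<float> y(m * n, 0.0f);
  if (c != nullptr) {
    assert(c->size() == m * n);
    for (std::size_t i = 0; i < m * n; ++i) {
      y[i] = beta * (*c)[i];
    }
  }

  // Pass 2: each split owns one contiguous slice of K and adds its scaled
  // partial product into the output. Here the splits run sequentially; on a
  // GPU they would typically run concurrently and need synchronized adds.
  const std::size_t chunk = (k + splits - 1) / splits;  // ceil(k / splits)
  for (std::size_t s = 0; s < splits; ++s) {
    const std::size_t k_begin = s * chunk;
    const std::size_t k_end = std::min(k, k_begin + chunk);
    for (std::size_t row = 0; row < m; ++row) {
      for (std::size_t col = 0; col < n; ++col) {
        float partial = 0.0f;
        for (std::size_t kk = k_begin; kk < k_end; ++kk) {
          partial += a[row * k + kk] * b[kk * n + col];
        }
        y[row * n + col] += alpha * partial;
      }
    }
  }
  return y;
}
```

In this sketch alpha is applied to each partial sum and beta is applied only once, in the initialization pass, so the result does not depend on the number of splits (up to floating-point rounding).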

Motivation and Context

With this PR we see about a 30% improvement on florence-2-base-vision-encoder-fp16 and a 10% improvement on detr-resnet-50-fp16.
