Commit b1dc127

Raghuveer Devulapalli (r-devulap) authored and committed
qgemm: optimize avxvnni QGEMM inner kernel for M=1
QGEMM benchmarks for M = 1 on a 13th Gen Intel(R) Core(TM) i9-13900K show a 1.4x improvement on a single thread.

|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
| Benchmark                                                          |   Time |     CPU | Time Old | Time New | CPU Old | CPU New |
|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
| QGEMM/UnsignedAPackB/M:1/N:512/K:512/Batch:1/Threads:1/real_time   | -0.275 | -0.2756 |     4330 |     3137 |    4330 |    3136 |
| QGEMM/UnsignedAPackB/M:1/N:512/K:1024/Batch:1/Threads:1/real_time  | -0.292 | -0.2927 |     9027 |     6385 |    9027 |    6385 |
| QGEMM/UnsignedAPackB/M:1/N:1024/K:1024/Batch:1/Threads:1/real_time | -0.300 | -0.3005 |    17867 |    12499 |   17866 |   12498 |
| OVERALL_GEOMEAN                                                    | -0.289 | -0.2897 |          |          |         |         |
|--------------------------------------------------------------------+--------+---------+----------+----------+---------+---------|
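
The 1.4x figure can be cross-checked against the Time Old and Time New columns. The sketch below is not part of the commit; it only redoes the arithmetic on the rounded values shown in the table, so the results may differ from the Time column in the last digit. The names `timings`, `relative_change`, and `geomean` are illustrative.

```python
import math

# (Time Old, Time New) pairs copied from the benchmark table, keyed by N/K.
timings = {
    "N:512/K:512": (4330, 3137),
    "N:512/K:1024": (9027, 6385),
    "N:1024/K:1024": (17867, 12499),
}


def relative_change(old, new):
    """Fractional change in time; negative means the new kernel is faster."""
    return (new - old) / old


def geomean(values):
    """Geometric mean of positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Per-benchmark relative change (comparable to the "Time" column).
changes = {name: relative_change(old, new) for name, (old, new) in timings.items()}

# Per-benchmark speedup, old time divided by new time.
speedups = {name: old / new for name, (old, new) in timings.items()}

# Geometric mean of the new/old time ratios, expressed as a relative change
# (comparable to the OVERALL_GEOMEAN row), and the matching overall speedup.
geomean_ratio = geomean([new / old for old, new in timings.values()])
geomean_change = geomean_ratio - 1.0
geomean_speedup = 1.0 / geomean_ratio

for name in timings:
    print(f"{name}: change {changes[name]:+.3f}, speedup {speedups[name]:.2f}x")
print(f"geomean: change {geomean_change:+.3f}, speedup {geomean_speedup:.2f}x")
```

A geomean time reduction of about 29% corresponds to a speedup of roughly 1 / (1 - 0.29), which is where the 1.4x in the commit message comes from.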
1 parent db898b2 commit b1dc127

4 files changed: +555 -99 lines changed
