-
-
Notifications
You must be signed in to change notification settings - Fork 3.9k
perf(gltf): optimize inner-loop getTransformMatrix call #9261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(gltf): optimize inner-loop getTransformMatrix call #9261
Conversation
…unneccessary matrix math
|
Hi 👋, thank you for your PR! We've run benchmarks in an emulated environment. Here are the results: ARM Emulated 32b - lv_conf_perf32b
Detailed Results Per Scene
ARM Emulated 64b - lv_conf_perf64b
Detailed Results Per Scene
Disclaimer: These benchmarks were run in an emulated environment using QEMU with instruction counting mode. 🤖 This comment was automatically generated by a bot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No issues found across 1 file
| [&](const TRS& trs) { | ||
| /* Note: There is some debate as to if it is more standard conformant to apply this line | ||
| * as translate(rotate(scale())), or scale(rotate(translate())). For now, it's still | ||
| * scale(rotate(translate())) to align with fastgltf's internals, but that may change - MK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * scale(rotate(translate())) to align with fastgltf's internals, but that may change - MK | |
| * scale(rotate(translate())) to align with fastgltf's internals, but that may change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh thanks, I'll remember that for future. That comment is gone now.
|
Note: my last commit to this PR incorporates a revised order of operations just recently updated in fastgltf. Be sure to update submodules before building this. |
|
Closing this PR to avoid confusion and merge conflicts since the order of operations it was changing has since been confirmed to be correct the original way, despite how it looks. #9273 is the confirmed correct way. I'll need to adjust two lines of 9273 to reflect the getLocalTransformMatrix call, but otherwise 9273 replaces this. |
There are several calls to getTransformMatrix, a fastgltf function, in our update loops that are called very often, especially for animated scenes. That function takes a 4x4 matrix and multiplies it against the local transform, to get a world transform, but in our calls to it we don't actually use that feature so it's just doing a bunch of multiplications for no good reason.
So this eliminates that unnecessary math to produce roughly 10-12% improved CPU-side performance, by my tests.