
@FishOfTheNorthStar
Contributor

Our update loops contain several calls to getTransformMatrix, a fastgltf function, and they run very often, especially in animated scenes. That function takes a 4x4 matrix and multiplies it against the local transform to produce a world transform. Our calls never use that feature, so each one performs a full matrix multiplication for no benefit.

This PR eliminates that unnecessary math, which improved CPU-side performance by roughly 10-12% in my tests.
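To illustrate the redundancy described above, here is a minimal sketch. It does not use fastgltf itself: `Mat4`, `multiply`, `transform_with_base`, and `transform_local_only` are hypothetical stand-ins, and the assumption is that the base matrix passed at the call sites is effectively the identity. Under that assumption the full product returns the local matrix unchanged, so the 64 scalar multiplications per call buy nothing.

```cpp
#include <array>
#include <cstddef>

// Hypothetical column-major 4x4 matrix; not fastgltf's own math type.
using Mat4 = std::array<float, 16>;

Mat4 identity() {
    return {1.0f, 0.0f, 0.0f, 0.0f,
            0.0f, 1.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 1.0f, 0.0f,
            0.0f, 0.0f, 0.0f, 1.0f};
}

// Full 4x4 product a * b: 64 scalar multiplications per call.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 out{};
    for (std::size_t col = 0; col < 4; ++col) {
        for (std::size_t row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            out[col * 4 + row] = sum;
        }
    }
    return out;
}

// Stand-in for a call that combines a base matrix with the local transform.
Mat4 transform_with_base(const Mat4& local, const Mat4& base) {
    return multiply(base, local);
}

// Stand-in for a local-only call: no multiplication at all.
Mat4 transform_local_only(const Mat4& local) {
    return local;
}

// Example local transform: uniform scale of 2 with translation (3, 4, 5).
Mat4 sample_local() {
    return {2.0f, 0.0f, 0.0f, 0.0f,
            0.0f, 2.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 2.0f, 0.0f,
            3.0f, 4.0f, 5.0f, 1.0f};
}
```

With an identity base, `transform_with_base(sample_local(), identity())` and `transform_local_only(sample_local())` produce the same matrix; only the second skips the multiply.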

@github-actions
Contributor

github-actions bot commented Nov 18, 2025

Hi 👋, thank you for your PR!

We've run benchmarks in an emulated environment. Here are the results:

ARM Emulated 32b - lv_conf_perf32b

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| All scenes avg. | 28 | 38 | 7 | 7 | 0 |

Detailed Results Per Scene

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| Empty screen | 11 | 33 | 0 | 0 | 0 |
| Moving wallpaper | 2 | 33 | 1 | 1 | 0 |
| Single rectangle | 0 | 50 | 0 | 0 | 0 |
| Multiple rectangles | 0 | 38 (-1) | 0 | 0 | 0 |
| Multiple RGB images | 0 | 38 (-1) | 0 | 0 | 0 |
| Multiple ARGB images | 9 | 42 | 0 | 0 | 0 |
| Rotated ARGB images | 54 | 44 (+1) | 15 (+1) | 15 (+1) | 0 |
| Multiple labels | 7 (+2) | 35 (+2) | 0 | 0 | 0 |
| Screen sized text | 97 (+1) | 47 | 20 | 20 | 0 |
| Multiple arcs | 33 (-6) | 33 | 7 | 7 | 0 |
| Containers | 1 (-2) | 37 | 0 | 0 | 0 |
| Containers with overlay | 97 (+8) | 21 | 44 | 44 | 0 |
| Containers with opa | 17 | 37 (-1) | 0 | 0 | 0 |
| Containers with opa_layer | 18 | 34 | 6 (+1) | 6 (+1) | 0 |
| Containers with scrolling | 46 (-1) | 46 | 12 | 12 | 0 |
| Widgets demo | 67 | 40 | 16 | 16 | 0 |
| All scenes avg. | 28 | 38 | 7 | 7 | 0 |

ARM Emulated 64b - lv_conf_perf64b

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| All scenes avg. | 24 | 38 | 6 | 6 | 0 |

Detailed Results Per Scene

| Scene Name | Avg CPU (%) | Avg FPS | Avg Time (ms) | Render Time (ms) | Flush Time (ms) |
|---|---|---|---|---|---|
| Empty screen | 11 | 33 | 0 | 0 | 0 |
| Moving wallpaper | 1 | 33 | 0 | 0 | 0 |
| Single rectangle | 0 | 49 (-1) | 0 | 0 | 0 |
| Multiple rectangles | 0 | 46 | 0 | 0 | 0 |
| Multiple RGB images | 0 | 39 | 0 | 0 | 0 |
| Multiple ARGB images | 1 | 38 | 0 | 0 | 0 |
| Rotated ARGB images | 29 | 34 | 9 | 9 | 0 |
| Multiple labels | 3 | 37 (-2) | 0 | 0 | 0 |
| Screen sized text | 82 (+1) | 45 | 17 | 17 | 0 |
| Multiple arcs | 34 (+1) | 33 (-1) | 6 | 6 | 0 |
| Containers | 4 | 37 | 0 | 0 | 0 |
| Containers with overlay | 88 (+1) | 23 | 41 | 41 | 0 |
| Containers with opa | 16 (+1) | 37 | 1 (+1) | 1 (+1) | 0 |
| Containers with opa_layer | 7 (-1) | 39 (+2) | 1 | 1 | 0 |
| Containers with scrolling | 45 (+1) | 47 (+1) | 12 (+1) | 12 (+1) | 0 |
| Widgets demo | 66 (-1) | 42 | 15 | 15 | 0 |
| All scenes avg. | 24 | 38 | 6 | 6 | 0 |

Disclaimer: These benchmarks were run in an emulated environment using QEMU with instruction counting mode.
The timing values represent relative performance metrics within this specific virtualized setup and should
not be interpreted as absolute real-world performance measurements. Values are deterministic and useful for
comparing different LVGL features and configurations, but may not correlate directly with performance on
physical hardware. The measurements are intended for comparative analysis only.


🤖 This comment was automatically generated by a bot.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 1 file

@AndreCostaaa AndreCostaaa changed the title from "refactor(gltf): optimize inner-loop getTransformMatrix call" to "perf(gltf): optimize inner-loop getTransformMatrix call" Nov 18, 2025
AndreCostaaa previously approved these changes Nov 18, 2025
[&](const TRS& trs) {
/* Note: There is some debate as to if it is more standard conformant to apply this line
* as translate(rotate(scale())), or scale(rotate(translate())). For now, it's still
* scale(rotate(translate())) to align with fastgltf's internals, but that may change - MK
Collaborator

Suggested change
- * scale(rotate(translate())) to align with fastgltf's internals, but that may change - MK
+ * scale(rotate(translate())) to align with fastgltf's internals, but that may change

Contributor Author

Oh thanks, I'll remember that for the future. That comment is gone now.

@AndreCostaaa AndreCostaaa self-requested a review November 18, 2025 16:55
@FishOfTheNorthStar
Contributor Author

Note: my last commit to this PR incorporates the revised order of operations that was recently updated in fastgltf. Be sure to update submodules before building.

@FishOfTheNorthStar
Contributor Author

Closing this PR to avoid confusion and merge conflicts: the order of operations it changed has since been confirmed to be correct as originally written, despite how it looks. #9273 has the confirmed correct order. I'll need to adjust two lines of #9273 to use the getLocalTransformMatrix call, but otherwise #9273 replaces this PR.
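For anyone following the order-of-operations question, a small sketch of why the order matters at all. This is not the code from either PR: `Vec3` and the helpers below are hypothetical, and rotation is left out for brevity. The glTF specification defines a node's local matrix as T * R * S, meaning scale is applied to a point first and translation last. The sketch does not try to settle which nesting in the code comment corresponds to that convention.

```cpp
// Hypothetical 3-component vector, for illustration only.
struct Vec3 {
    float x;
    float y;
    float z;
};

// Component-wise scale of a point.
Vec3 scale(Vec3 p, Vec3 s) {
    return {p.x * s.x, p.y * s.y, p.z * s.z};
}

// Translation of a point.
Vec3 translate(Vec3 p, Vec3 t) {
    return {p.x + t.x, p.y + t.y, p.z + t.z};
}

// Scale first, then translate: what T * S does to a column vector.
Vec3 scale_then_translate(Vec3 p, Vec3 s, Vec3 t) {
    return translate(scale(p, s), t);
}

// Translate first, then scale: the translation itself gets scaled.
Vec3 translate_then_scale(Vec3 p, Vec3 s, Vec3 t) {
    return scale(translate(p, t), s);
}
```

For the point (1, 0, 0), a uniform scale of 2, and a translation of (10, 0, 0), the first order gives x = 12 and the second gives x = 22, so swapping the order visibly moves every scaled node.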

3 participants