Conversation

@DajanaV (Collaborator) commented Nov 14, 2025

Mirrored from ggml-org/llama.cpp#17257

  • Currently the CMake install targets of KleidiAI generate an error.
  • Set SOURCE_SUBDIR to an explicitly invalid path as a bypass.
  • Adding the EXCLUDE_FROM_ALL option would achieve the same fix, but it requires CMake 3.28 or newer.

REF: https://discourse.cmake.org/t/prevent-fetchcontent-makeavailable-to-execute-cmakelists-txt/12704/3
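The SOURCE_SUBDIR bypass described above can be sketched roughly as follows. This is an illustration of the pattern, not the exact patch: the repository URL, tag, and dependency name are placeholders.

```cmake
include(FetchContent)

# Point SOURCE_SUBDIR at a directory that does not exist in the fetched
# sources. FetchContent_MakeAvailable() then downloads the dependency but
# skips add_subdirectory(), so KleidiAI's CMakeLists.txt is never executed
# and none of its build/install targets are generated.
FetchContent_Declare(kleidiai_download
    GIT_REPOSITORY <kleidiai-repository-url>   # placeholder
    GIT_TAG        <pinned-tag>                # placeholder
    SOURCE_SUBDIR  does-not-exist              # bypass: invalid subdirectory
)
FetchContent_MakeAvailable(kleidiai_download)

# The downloaded sources remain available via ${kleidiai_download_SOURCE_DIR}.
```

This works because FetchContent_MakeAvailable() only calls add_subdirectory() when a CMakeLists.txt exists under the declared SOURCE_SUBDIR; an invalid path makes the call a download-only step.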

Current install error log:

$ cmake --install build-kai --prefix install-kai
-- Install configuration: "Release"
-- Installing: /home/yongjoo/llama.cpp/install-kai/lib/libggml-cpu.so
CMake Error at build-kai/_deps/kleidiai_download-build/cmake_install.cmake:52 (file):
  file INSTALL cannot find
  "/home/yongjoo/llama.cpp/build-kai/bin/libkleidiai.so": No such file or
  directory.
Call Stack (most recent call first):
  build-kai/ggml/src/cmake_install.cmake:71 (include)
  build-kai/ggml/cmake_install.cmake:47 (include)
  build-kai/cmake_install.cmake:47 (include)

@loci-agentic-ai

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

Pull Request #205 addresses a CMake build system issue related to KleidiAI dependency management. The changes are isolated to build configuration and do not modify runtime code or core inference functionality.

Performance Impact Assessment

Highest Performance Changes Detected:

  • Response Time: _RegexMask constructor in build.bin.llama-tts (+0.082%, +0.018 ns)
  • Throughput: _Optional_base constructor in build.bin.llama-run (-0.170%, -0.040 ns)

Core Function Impact Analysis:
No core inference functions were modified. The detected changes affect standard library constructors unrelated to:

  • llama_decode() - No changes detected
  • llama_encode() - No changes detected
  • llama_tokenize() - No changes detected

Tokens Per Second Impact:
No impact on inference throughput. The performance variations are in non-critical utility functions outside the tokenization/inference pipeline. Core processing functions remain unchanged, maintaining baseline performance characteristics.

Power Consumption Analysis:
Negligible energy efficiency changes across all binaries:

  • build.bin.libllama.so: +1.013 nJ (0.0004% increase)
  • build.bin.llama-run: -0.048 nJ (negligible decrease)
  • All other binaries show no measurable change (<0.001%)

Flame Graph and CFG Analysis:

  • _RegexMask constructor shows single-frame execution (22 ns) with no sub-function calls
  • Control flow graphs are identical between versions with matching assembly code
  • Performance variations represent measurement noise rather than functional changes

Code Review Findings:
The PR successfully resolves CMake installation failures by preventing KleidiAI build targets from executing during installation. Changes are appropriately scoped to build system configuration without affecting runtime behavior or API compatibility.

Conclusion

This build system fix introduces no performance regressions or functional changes. The minimal timing variations detected are within measurement precision limits and unrelated to the actual modifications made. The PR effectively addresses the stated installation issue while maintaining system performance and stability.

@DajanaV DajanaV force-pushed the main branch 23 times, most recently from 88cd3fd to c24d77f Compare November 17, 2025 01:37
* currently the CMake install targets of KleidiAI generate an error
* use `FetchContent_Populate` to disable all build/install targets of KleidiAI
* note that CMake complains about the deprecation of `FetchContent_Populate`

Signed-off-by: Yongjoo Ahn <[email protected]>
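The `FetchContent_Populate` variant from the commit message above can be sketched like this. Repository details and the consuming target name (`ggml-cpu`) are placeholders/assumptions, not taken from the actual patch:

```cmake
include(FetchContent)

FetchContent_Declare(kleidiai_download
    GIT_REPOSITORY <kleidiai-repository-url>   # placeholder
    GIT_TAG        <pinned-tag>                # placeholder
)

# FetchContent_Populate() only downloads the sources and sets
# kleidiai_download_SOURCE_DIR; unlike FetchContent_MakeAvailable() it never
# calls add_subdirectory(), so none of KleidiAI's build/install targets are
# created. Newer CMake releases warn that this single-argument form is
# deprecated, matching the caveat in the commit message.
FetchContent_Populate(kleidiai_download)

# Consume the sources directly from the populated tree (assumed target name).
target_include_directories(ggml-cpu PRIVATE ${kleidiai_download_SOURCE_DIR})
```

The trade-off between the two approaches: the SOURCE_SUBDIR bypass keeps the non-deprecated API but relies on an intentionally invalid path, while FetchContent_Populate is explicit about "download only" at the cost of a deprecation warning.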
@DajanaV DajanaV force-pushed the upstream-PR17257-branch_anyj0527-fix-kleidiai-install branch from a02af4c to 2664317 Compare November 17, 2025 05:37
@loci-agentic-ai

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

The analysis examined version fdc9a92c-ce85-426c-b889-d4407ecc4314 against baseline 753a819c-e472-4f92-98cc-a780e0132be2 in the llama.cpp project. The changes primarily involve a CMake build system fix for KleidiAI integration, with minimal performance impact on runtime functions.

Performance Metrics

Highest Response Time Change:

  • Function: std::vector<llm_bigram_spm>::pop_back() in build.bin.libllama.so
  • Change: -0.10% (67 ns → 67 ns, -0.07 ns absolute improvement)

Highest Throughput Change:

  • Function: std::__detail::_Executor::_M_match_multiline() in build.bin.llama-run
  • Change: +0.11% (39 ns → 39 ns, +0.04 ns absolute increase)

Power Consumption:
All 16 binaries show negligible power consumption changes (≤0.001%). The largest change was +0.47 nJ in build.bin.libllama.so (+0.0002%).

Key Findings

Core Function Impact: None of the performance-critical inference functions (llama_decode, llama_encode, llama_tokenize) were modified or showed measurable performance changes. The affected functions are peripheral utilities in tokenization and regex processing.

Inference Performance: Token throughput remains unaffected as no core inference pipeline functions experienced meaningful changes. The sub-nanosecond variations in utility functions will not impact overall tokens per second performance.

Technical Analysis:

  • Flame Graph: Shows pop_back() as a single-node, 66 ns atomic operation with no subcalls, indicating optimized inlined implementation
  • CFG Comparison: Identical control flow graphs and assembly code between versions, confirming the performance differences stem from microarchitectural factors rather than code changes
  • Code Review: The primary change is a CMake build fix replacing FetchContent_MakeAvailable() with FetchContent_Populate() for KleidiAI integration, resolving installation errors without affecting runtime behavior

Conclusion: The version update maintains stable performance characteristics with all changes falling within statistical noise levels. The build system improvements enhance deployment reliability without impacting inference performance.

@DajanaV DajanaV force-pushed the main branch 5 times, most recently from f333350 to 9c4623f Compare November 18, 2025 09:10
