Conversation

@DajanaV (Collaborator) commented Nov 14, 2025

Mirrored from ggml-org/llama.cpp#17257

  • Currently the CMake install targets of KleidiAI generate an error.
  • Set SOURCE_SUBDIR to an explicitly invalid path as a bypass.
  • Adding the EXCLUDE_FROM_ALL option would achieve the same fix, but it requires CMake 3.28 or newer.

REF: https://discourse.cmake.org/t/prevent-fetchcontent-makeavailable-to-execute-cmakelists-txt/12704/3
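The SOURCE_SUBDIR bypass described above can be sketched roughly as follows. This is an illustration of the pattern, not the exact patch: the repository URL, tag, and dependency name are placeholders.

```cmake
include(FetchContent)

# Point SOURCE_SUBDIR at a directory that does not exist in the fetched
# sources. FetchContent_MakeAvailable() then downloads the dependency but
# skips add_subdirectory(), so KleidiAI's CMakeLists.txt is never executed
# and none of its build/install targets are generated.
FetchContent_Declare(kleidiai_download
    GIT_REPOSITORY <kleidiai-repository-url>   # placeholder
    GIT_TAG        <pinned-tag>                # placeholder
    SOURCE_SUBDIR  does-not-exist              # bypass: invalid subdirectory
)
FetchContent_MakeAvailable(kleidiai_download)

# The downloaded sources remain available via ${kleidiai_download_SOURCE_DIR}.
```

This works because FetchContent_MakeAvailable() only calls add_subdirectory() when a CMakeLists.txt exists under the declared SOURCE_SUBDIR; an invalid path makes the call a download-only step.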

Current install error log:

$ cmake --install build-kai --prefix install-kai
-- Install configuration: "Release"
-- Installing: /home/yongjoo/llama.cpp/install-kai/lib/libggml-cpu.so
CMake Error at build-kai/_deps/kleidiai_download-build/cmake_install.cmake:52 (file):
  file INSTALL cannot find
  "/home/yongjoo/llama.cpp/build-kai/bin/libkleidiai.so": No such file or
  directory.
Call Stack (most recent call first):
  build-kai/ggml/src/cmake_install.cmake:71 (include)
  build-kai/ggml/cmake_install.cmake:47 (include)
  build-kai/cmake_install.cmake:47 (include)

@loci-agentic-ai

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

Pull Request #205 addresses a CMake build system issue related to KleidiAI dependency management. The changes are isolated to build configuration and do not modify runtime code or core inference functionality.

Performance Impact Assessment

Highest Performance Changes Detected:

  • Response Time: _RegexMask constructor in build.bin.llama-tts (+0.082%, +0.018 ns)
  • Throughput: _Optional_base constructor in build.bin.llama-run (-0.170%, -0.040 ns)

Core Function Impact Analysis:
No core inference functions were modified. The detected changes affect standard library constructors unrelated to:

  • llama_decode() - No changes detected
  • llama_encode() - No changes detected
  • llama_tokenize() - No changes detected

Tokens Per Second Impact:
No impact on inference throughput. The performance variations are in non-critical utility functions outside the tokenization/inference pipeline. Core processing functions remain unchanged, maintaining baseline performance characteristics.

Power Consumption Analysis:
Negligible energy efficiency changes across all binaries:

  • build.bin.libllama.so: +1.013 nJ (0.0004% increase)
  • build.bin.llama-run: -0.048 nJ (negligible decrease)
  • All other binaries show no measurable change (<0.001%)

Flame Graph and CFG Analysis:

  • _RegexMask constructor shows single-frame execution (22 ns) with no sub-function calls
  • Control flow graphs are identical between versions with matching assembly code
  • Performance variations represent measurement noise rather than functional changes

Code Review Findings:
The PR successfully resolves CMake installation failures by preventing KleidiAI build targets from executing during installation. Changes are appropriately scoped to build system configuration without affecting runtime behavior or API compatibility.

Conclusion

This build system fix introduces no performance regressions or functional changes. The minimal timing variations detected are within measurement precision limits and unrelated to the actual modifications made. The PR effectively addresses the stated installation issue while maintaining system performance and stability.

@DajanaV DajanaV force-pushed the main branch 23 times, most recently from 88cd3fd to c24d77f Compare November 17, 2025 01:37
* currently the CMake install targets of KleidiAI generate an error
* use `FetchContent_Populate` to disable all build/install targets of KleidiAI
* note that CMake complains about the deprecation of `FetchContent_Populate`

Signed-off-by: Yongjoo Ahn <[email protected]>
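The `FetchContent_Populate` variant from the commit message above can be sketched like this. Repository details and the consuming target name (`ggml-cpu`) are placeholders/assumptions, not taken from the actual patch:

```cmake
include(FetchContent)

FetchContent_Declare(kleidiai_download
    GIT_REPOSITORY <kleidiai-repository-url>   # placeholder
    GIT_TAG        <pinned-tag>                # placeholder
)

# FetchContent_Populate() only downloads the sources and sets
# kleidiai_download_SOURCE_DIR; unlike FetchContent_MakeAvailable() it never
# calls add_subdirectory(), so none of KleidiAI's build/install targets are
# created. Newer CMake releases warn that this single-argument form is
# deprecated, matching the caveat in the commit message.
FetchContent_Populate(kleidiai_download)

# Consume the sources directly from the populated tree (assumed target name).
target_include_directories(ggml-cpu PRIVATE ${kleidiai_download_SOURCE_DIR})
```

The trade-off between the two approaches: the SOURCE_SUBDIR bypass keeps the non-deprecated API but relies on an intentionally invalid path, while FetchContent_Populate is explicit about "download only" at the cost of a deprecation warning.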
@DajanaV DajanaV force-pushed the upstream-PR17257-branch_anyj0527-fix-kleidiai-install branch from a02af4c to 2664317 Compare November 17, 2025 05:37
@loci-agentic-ai

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

The analysis examined version fdc9a92c-ce85-426c-b889-d4407ecc4314 against baseline 753a819c-e472-4f92-98cc-a780e0132be2 in the llama.cpp project. The changes primarily involve a CMake build system fix for KleidiAI integration, with minimal performance impact on runtime functions.

Performance Metrics

Highest Response Time Change:

  • Function: std::vector<llm_bigram_spm>::pop_back() in build.bin.libllama.so
  • Change: -0.10% (67 ns → 67 ns, -0.07 ns absolute improvement)

Highest Throughput Change:

  • Function: std::__detail::_Executor::_M_match_multiline() in build.bin.llama-run
  • Change: +0.11% (39 ns → 39 ns, +0.04 ns absolute increase)

Power Consumption:
All 16 binaries show negligible power consumption changes (≤0.001%). The largest change was +0.47 nJ in build.bin.libllama.so (+0.0002%).

Key Findings

Core Function Impact: None of the performance-critical inference functions (llama_decode, llama_encode, llama_tokenize) were modified or showed measurable performance changes. The affected functions are peripheral utilities in tokenization and regex processing.

Inference Performance: Token throughput remains unaffected as no core inference pipeline functions experienced meaningful changes. The sub-nanosecond variations in utility functions will not impact overall tokens per second performance.

Technical Analysis:

  • Flame Graph: Shows pop_back() as a single-node, 66 ns atomic operation with no subcalls, indicating optimized inlined implementation
  • CFG Comparison: Identical control flow graphs and assembly code between versions, confirming the performance differences stem from microarchitectural factors rather than code changes
  • Code Review: The primary change is a CMake build fix replacing FetchContent_MakeAvailable() with FetchContent_Populate() for KleidiAI integration, resolving installation errors without affecting runtime behavior

Conclusion: The version update maintains stable performance characteristics with all changes falling within statistical noise levels. The build system improvements enhance deployment reliability without impacting inference performance.

@DajanaV DajanaV force-pushed the main branch 5 times, most recently from f333350 to 9c4623f Compare November 18, 2025 09:10
