
Conversation


@Radu2k Radu2k commented Nov 4, 2025

[build][cmake]: Bundle ACL dynlibs and torch libgomp for CPU extension builds

Purpose

Bundle the ACL dynamic libraries libarm_compute.so and libarm_compute_graph.so on AArch64 into the .libs package folder. Use the same libgomp as the PyTorch package for the oneDNN and ACL builds by adding its path to LD_LIBRARY_PATH.

Fixes: #28056

Test Plan

  • Built wheel contains .libs directory
  • <env>/lib/python<version>/site-packages/vllm/.libs contains libarm_compute.so and libarm_compute_graph.so
  • readelf -d _C.abi3.so runpath points to $ORIGIN:$ORIGIN/.libs

Test Result

  • .libs folder is part of the package
  • The appropriate ACL dynamic libs are in .libs/
  • $ORIGIN:$ORIGIN/.libs are part of the rpath
  • When we run vllm, the following warning no longer appears: WARNING 11-04 18:30:32 [interface.py:171] Failed to import from vllm._C: ImportError('libarm_compute.so: cannot open shared object file: No such file or directory')

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@Radu2k Radu2k requested a review from bigPYJ1151 as a code owner November 4, 2025 18:34

github-actions bot commented Nov 4, 2025

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run fastcheck CI, which runs a small and essential subset of CI tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify mergify bot added the ci/build label Nov 4, 2025

Radu2k commented Nov 4, 2025

cc: @fadara01

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request introduces changes to bundle Arm Compute Library (ACL) dynamic libraries for CPU extension builds on AArch64 and correctly configures the build to use PyTorch's libgomp. The approach is sound and the new logic for handling dynamic libraries in cmake/utils.cmake is a good, generic addition. However, I've identified two high-severity issues in the implementation. One is a potential security risk in the INSTALL_RPATH setting, and the other is a build robustness issue with file(GLOB) that could lead to incomplete installations. Addressing these will make the build process more secure and reliable.

if(UNIX AND NOT APPLE)
set_target_properties(${GPU_MOD_NAME} PROPERTIES
BUILD_RPATH "\$ORIGIN;\$ORIGIN/.libs"
INSTALL_RPATH "\$ORIGIN;\$ORIGIN/.libs;"
Contributor


high

The INSTALL_RPATH has a trailing semicolon. This will result in an empty path component in the RPATH of the installed library (e.g., ...:$ORIGIN:$ORIGIN/.libs:). On some systems, an empty RPATH component is interpreted as the current working directory (.), which can be a security risk due to potential SO/DLL hijacking. It's best to remove the trailing semicolon to avoid this ambiguity and potential vulnerability.

      INSTALL_RPATH "\$ORIGIN;\$ORIGIN/.libs"
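The effect of the stray separator can be illustrated outside CMake (a minimal Python sketch, not part of the PR): CMake joins the property list with semicolons, which become `:` separators in the ELF RPATH, so the trailing semicolon yields a trailing colon and hence an empty path component.

```python
# CMake's ';' list separators become ':' in the ELF RPATH, so a trailing
# ';' yields a trailing ':'. Splitting the string the way the dynamic
# loader does exposes the empty component, which ld.so treats as the
# current working directory.
rpath_with_trailing = "$ORIGIN:$ORIGIN/.libs:"
rpath_fixed = "$ORIGIN:$ORIGIN/.libs"

print(rpath_with_trailing.split(":"))  # ['$ORIGIN', '$ORIGIN/.libs', '']
print(rpath_fixed.split(":"))          # ['$ORIGIN', '$ORIGIN/.libs']
```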

Comment on lines 602 to 606
foreach(_dyn_lib_path ${_dyn_install_paths})
  get_filename_component(_dir "${_dyn_lib_path}" DIRECTORY)
  get_filename_component(_base "${_dyn_lib_path}" NAME)
  file(GLOB _selected LIST_DIRECTORIES FALSE "${_dir}/${_base}*")
  list(REMOVE_DUPLICATES _selected)
  install(FILES "${_selected}"
          DESTINATION "${GPU_DESTINATION}/.libs"
          COMPONENT ${GPU_MOD_NAME})
endforeach()
Contributor


high

Using file(GLOB) at configure time to find build artifacts for installation can lead to an incomplete package. If the scons build (which generates these libraries) changes its output files (e.g., due to a version bump), the list of files to be installed will become stale. A developer running make install without re-running cmake would then install an incorrect set of files.

A more robust approach is to perform the file discovery at install time. This can be achieved by generating a script with file(GENERATE ...) and executing it at install time with install(SCRIPT ...).
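One way to sketch that deferral (illustrative only; `ACL_LIB_DIR`, `GPU_DESTINATION`, and `GPU_MOD_NAME` follow the surrounding diff, and `install(CODE ...)` is used here as a lighter-weight alternative to `file(GENERATE)` plus `install(SCRIPT)`):

```cmake
# Defer the glob to install time so the file list can never go stale:
# the CODE block runs during `make install`, not at configure time.
install(CODE "
  file(GLOB _acl_libs LIST_DIRECTORIES false
       \"${ACL_LIB_DIR}/libarm_compute*.so*\")
  file(INSTALL \${_acl_libs}
       DESTINATION \"\${CMAKE_INSTALL_PREFIX}/${GPU_DESTINATION}/.libs\")
" COMPONENT ${GPU_MOD_NAME})
```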


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +256 to +258
# Add ACL libraries to be linked as dynamic libraries
find_library(ACL_COMPUTE arm_compute PATHS "${ACL_LIB_DIR}" REQUIRED NO_DEFAULT_PATH)
find_library(ACL_COMPUTE_GRAPH arm_compute_graph PATHS "${ACL_LIB_DIR}" REQUIRED NO_DEFAULT_PATH)


P1: Initialize ACL_LIB_DIR when ACL_ROOT_DIR is provided

The new find_library calls expect ACL_LIB_DIR to point to the directory that contains libarm_compute*.so. However, ACL_LIB_DIR is only assigned inside the branch where ACL is fetched from GitHub. When a caller provides ACL_ROOT_DIR via the environment (the documented override supported before this commit), the variable stays empty and the REQUIRED find_library invocations fail because PATHS "" with NO_DEFAULT_PATH searches nothing. This breaks builds that rely on a pre-downloaded or prebuilt ACL tree. Make sure ACL_LIB_DIR is derived from $ENV{ACL_ROOT_DIR} in both code paths before resolving the libraries.
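A possible shape for the fix (a sketch only; the `build` subdirectory is an assumption about the prebuilt ACL tree layout):

```cmake
# Ensure ACL_LIB_DIR is assigned on both code paths before the
# REQUIRED lookups run.
if(DEFINED ENV{ACL_ROOT_DIR})
  # Pre-downloaded/prebuilt ACL tree supplied by the caller.
  set(ACL_LIB_DIR "$ENV{ACL_ROOT_DIR}/build")
else()
  # Fetch ACL from GitHub; this branch already assigns ACL_LIB_DIR.
endif()

find_library(ACL_COMPUTE arm_compute PATHS "${ACL_LIB_DIR}" REQUIRED NO_DEFAULT_PATH)
find_library(ACL_COMPUTE_GRAPH arm_compute_graph PATHS "${ACL_LIB_DIR}" REQUIRED NO_DEFAULT_PATH)
```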


# Build oneDNN for GEMM kernels (only for x86-AVX512 /ARM platforms)
if ((AVX512_FOUND AND NOT AVX512_DISABLED) OR (ASIMD_FOUND AND NOT APPLE_SILICON_FOUND) OR POWER9_FOUND OR POWER10_FOUND OR POWER11_FOUND)
# Set number of parallel build processes
include(ProcessorCount)

@fadara01 commented Nov 4, 2025


Let's put these changes under ASIMD_FOUND, as x86 might not use libgomp in PyTorch (i.e. they might be using libiomp or something else).

endmacro()

#
# Extract unique CUDA architectures from a list of compute capabilities codes in
Contributor


I'm in favor of getting rid of the whitespace changes in this PR.

# and create a local shim dir with it
vllm_prepare_torch_gomp_shim(VLLM_TORCH_GOMP_SHIM_DIR)

find_library(OPEN_MP
Contributor


Could you elaborate on why any libgomp-related changes are needed here? What problem do they address?

Author


To avoid any undefined behaviour from mixing OpenMP runtimes, oneDNN and ACL should both use the same OMP version as PyTorch.

Contributor


ATM, ACL will load PyTorch's libgomp as per #27415, and oneDNN must already be linking against PyTorch's libgomp, because that's the only libgomp I see being loaded. Please let me know if that's not already the case.

Author


That is mostly true.

The oneDNN static library (libdnnl.a) is compiled against whatever OMP version is found at compile time, which we currently do not pin to PyTorch's version. If the runtime OMP is not ABI-compatible with the compile-time OMP, you can get link failures etc.


@fadara01 commented Nov 6, 2025


If runtime OMP is not ABI-compatible with compile time OMP

Yeah, we should make sure that PyTorch's libgomp is used at both compile time and runtime. Is that not already happening?


@Radu2k commented Nov 7, 2025


I don't see any OMP-specific variable being set for oneDNN, which implies it will use the system path; we specify PyTorch's OMP solely for ACL. By adding the shim directory to LD_LIBRARY_PATH, we make sure that oneDNN also picks up the version from VLLM_TORCH_GOMP_SHIM_DIR before checking system paths.
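As a rough illustration of the mechanism (the shim path below is hypothetical; the real one comes from the CMake variable VLLM_TORCH_GOMP_SHIM_DIR at build time):

```shell
# Prepend the shim directory so the loader (and the oneDNN build)
# resolve libgomp from there before any system copy.
SHIM_DIR="/tmp/torch_gomp_shim"  # hypothetical stand-in for VLLM_TORCH_GOMP_SHIM_DIR
export LD_LIBRARY_PATH="${SHIM_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "${LD_LIBRARY_PATH%%:*}"  # first search entry is now the shim dir
```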

@Radu2k Radu2k force-pushed the bundle-acl-dynlibs-torch branch from f721fed to 2181080 Compare November 5, 2025 16:16

Radu2k commented Nov 5, 2025

@bigPYJ1151 Is support for dynamic linking of libraries something of interest for vLLM?

@bigPYJ1151
Member

@bigPYJ1151 Is support for dynamic linking of libraries something of interest for vLLM?

Hi @Radu2k, you want to link OpenMP in vllm __init__? It's good if it can work well.


Radu2k commented Nov 6, 2025

Hi @bigPYJ1151 There are two key points here:

  1. We heavily rely on the Arm Compute Library (ACL) for performance on AArch64. We usually integrate it via oneDNN by dynamically linking ACL, which means that at runtime we depend on libarm_compute.so and libarm_compute_graph.so. For now, vLLM does not support dynamically linked libraries in define_gpu_extension_target (as added in this PR). Enabling this requires a .libs folder in the wheel, where vLLM would host dynamically linked libs that are not part of the torch package or other env packages. If this would be useful in the near future, we are more than happy to add the functionality in this PR; otherwise we can statically link ACL into the oneDNN build to be less intrusive.

  2. As torch, oneDNN and ACL all rely on OMP, the best scenario is to use the same OMP shared object across the board so we don't see any undefined behaviour at runtime, as mentioned before in the issue by Fadi, so we handle that as part of the PR as well.



Successfully merging this pull request may close these issues.

[Bug]: Missing libarm_compute.so in Arm CPU pip installed wheels

3 participants