## Bumping llama.cpp version

### Overview

This project uses llama.cpp as a git submodule located at `vendor/llama.cpp`, which points to the official llama.cpp repository at https://github.com/ggml-org/llama.cpp.git.

The project applies custom patches to llama.cpp's server implementation (`server.cpp` and `utils.hpp`) to integrate with the Docker model-runner architecture. These patches are maintained in `src/server/server.patch`.
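
Before starting a bump, it can help to see what you are working with. Both commands below are standard git and read-only:

```bash
# Show the commit the submodule is currently pinned to
git submodule status vendor/llama.cpp

# Summarize which files our patch touches (reads the patch; modifies nothing)
git apply --stat src/server/server.patch
```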
24 | 19 |
|
| 20 | +### Prerequisites |
| 21 | + |
| 22 | +Before bumping the version, ensure the submodule is initialized. **Run this command from the project root directory** (e.g., `/path/to/model-runner`): |
| 23 | + |
| 24 | +```bash |
| 25 | +# From the project root directory |
| 26 | +git submodule update --init --recursive |
25 | 27 | ``` |
26 | | -make -C src/server clean |
27 | | -make -C src/server |
28 | | -``` |
29 | 28 |
|
30 | | -3. Make sure everyting builds cleanly following the update. |
| 29 | +If the submodule is already initialized, this command is safe to run and will ensure it's up to date. |
| 30 | + |
| 31 | +**Note:** This command must be executed from the repository root because it needs to access the `.gitmodules` file and the submodule paths are relative to the root directory. |
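
If you are unsure whether you are at the repository root, you can jump there directly:

```bash
# Change to the top-level directory of the current checkout
cd "$(git rev-parse --show-toplevel)"
```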

### Step-by-Step Process

1. **Find the desired llama.cpp version:**

   Visit https://github.com/ggml-org/llama.cpp/releases to find the latest stable release or a specific version you want to use. We typically bump to the latest tagged commit (e.g., `b1234`).
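
   You can also list the newest tags from the command line; `--sort=-v:refname` puts the most recent build tags first (requires Git 2.18 or later):

   ```bash
   # List the five newest tags on the official repository
   git ls-remote --sort=-v:refname --tags https://github.com/ggml-org/llama.cpp.git | head -n 5
   ```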

2. **Update the submodule to the desired version:**

   ```bash
   pushd vendor/llama.cpp
   git fetch origin
   git checkout <desired llama.cpp sha> # usually we bump to the latest tagged commit
   popd
   ```
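
   Before moving on, it's worth confirming the submodule is on the commit you intended:

   ```bash
   # Print the tag (or nearest tag) of the submodule's checked-out commit
   git -C vendor/llama.cpp describe --tags
   ```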

3. **Apply the custom llama-server patch:**

   ```bash
   make -C src/server clean
   make -C src/server
   ```

   This will:
   - Clean up the previously patched files
   - Copy the new `server.cpp` and `utils.hpp` from the updated llama.cpp
   - Apply our custom patches from `src/server/server.patch`
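
   If the patch fails to apply after a large upstream change, you can dry-run it to see which hunks conflict before editing anything. This sketch assumes the patch paths resolve from the repository root; adjust to match the Makefile's layout:

   ```bash
   # Read-only check: reports failing hunks without modifying files
   git apply --check --verbose src/server/server.patch
   ```

   If it reports conflicts, regenerate or hand-edit `src/server/server.patch` against the new sources and re-run `make -C src/server`.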

4. **Build and test:**

   ```bash
   # Build from the native directory
   cmake -B build
   cmake --build build --parallel 8 --config Release

   # Test the build
   ./build/bin/com.docker.llama-server --model <path to model>
   ```

   Make sure everything builds cleanly without errors.
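
   As an optional smoke test, you can also probe the server over HTTP. This assumes our patches keep llama.cpp's standard `/health` endpoint and its default port 8080; adjust if the patched server changes either:

   ```bash
   # In one terminal: start the server
   ./build/bin/com.docker.llama-server --model <path to model>

   # In another terminal: expect an OK response once the model has loaded
   curl http://localhost:8080/health
   ```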

5. **Commit the submodule update:**

   ```bash
   git add vendor/llama.cpp
   git commit -m "Bump llama.cpp to <version>"
   ```
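
Before committing, you can double-check that the submodule pointer is staged and see the range of upstream commits the bump pulls in:

```bash
# Summarize the staged submodule change as a log of upstream commits
git diff --cached --submodule=log vendor/llama.cpp
```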