
Commit 10e184e

Bump llama version to c22473b580807929fd9e3a3344a48e8cfbe6c88f (#355)

* feat(llama.cpp): update README and server implementation for token handling
* Apply suggestion from @gemini-code-assist[bot]

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

1 parent ad31658 · commit 10e184e

File tree

4 files changed (+242, -189 lines)

llamacpp/native/README.md

Lines changed: 60 additions & 12 deletions
@@ -11,20 +11,68 @@

## Bumping llama.cpp version

### Overview

This project uses llama.cpp as a git submodule located at `vendor/llama.cpp`, which points to the official llama.cpp repository at https://github.com/ggml-org/llama.cpp.git.

The project applies custom patches to llama.cpp's server implementation (`server.cpp` and `utils.hpp`) to integrate with the Docker model-runner architecture. These patches are maintained in `src/server/server.patch`.
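
For reference, the submodule wiring is recorded in `.gitmodules` at the repository root. One quick way to inspect it is sketched below; the key names assume git's default of naming a submodule after its path, and the commented output is approximate rather than a verbatim capture:

```bash
# Inspect the submodule entry recorded in .gitmodules (output is approximate)
git config -f .gitmodules --get-regexp '^submodule\.vendor/llama\.cpp\.'
# submodule.vendor/llama.cpp.path vendor/llama.cpp
# submodule.vendor/llama.cpp.url https://github.com/ggml-org/llama.cpp.git
```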
### Prerequisites

Before bumping the version, ensure the submodule is initialized. **Run this command from the project root directory** (e.g., `/path/to/model-runner`):

```bash
# From the project root directory
git submodule update --init --recursive
```

If the submodule is already initialized, this command is safe to run and will ensure it's up to date.

**Note:** This command must be executed from the repository root because it needs to access the `.gitmodules` file, and the submodule paths are relative to the root directory.
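
To double-check the submodule's state before bumping, `git submodule status` prints the commit it currently points at (a leading `-` means uninitialized, `+` means the checkout differs from the recorded commit):

```bash
# From the project root: show the commit vendor/llama.cpp is checked out at
git submodule status vendor/llama.cpp
```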
### Step-by-Step Process

1. **Find the desired llama.cpp version:**

   Visit https://github.com/ggml-org/llama.cpp/releases to find the latest stable release or a specific version you want to use. We typically bump to the latest tagged commit (e.g., `b1234`, `b2345`, etc.). You can also list recent tags from the command line, as sketched below.
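
   One way to do this without leaving the terminal is to query tags inside the submodule; this is a sketch and assumes the `bNNNN` tag naming shown above:

   ```bash
   # List the five most recently created llama.cpp tags (names like b1234)
   git -C vendor/llama.cpp fetch origin --tags
   git -C vendor/llama.cpp tag --sort=-creatordate | head -n 5
   ```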

2. **Update the submodule to the desired version:**

   ```bash
   pushd vendor/llama.cpp
   git fetch origin
   git checkout <desired llama.cpp sha>  # usually we bump to the latest tagged commit
   popd
   ```
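
   To confirm the checkout landed where you intended, `git describe` reports the nearest tag (this assumes the commit you checked out is tagged or has a tagged ancestor):

   ```bash
   # Confirm which tag/commit the submodule now sits on
   git -C vendor/llama.cpp describe --tags
   ```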

3. **Apply the custom llama-server patch:**

   ```bash
   make -C src/server clean
   make -C src/server
   ```

   This will:

   - Clean the previous patched files
   - Copy the new `server.cpp` and `utils.hpp` from the updated llama.cpp
   - Apply our custom patches from `src/server/server.patch`
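
   In bash terms the flow is roughly the following; this is a hypothetical sketch, since the actual Makefile (and the source paths inside the submodule) are not shown in this diff:

   ```bash
   # Hypothetical equivalent of `make -C src/server`; paths and patch level are assumptions
   cp vendor/llama.cpp/tools/server/server.cpp src/server/
   cp vendor/llama.cpp/tools/server/utils.hpp src/server/
   patch -d src/server -p1 < src/server/server.patch
   ```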

4. **Build and test:**

   ```bash
   # Build from the native directory
   cmake -B build
   cmake --build build --parallel 8 --config Release

   # Test the build
   ./build/bin/com.docker.llama-server --model <path to model>
   ```

   Make sure everything builds cleanly without errors.
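
   For a quick smoke test while the server is running, you can probe it over HTTP. This assumes the patched server keeps llama.cpp's stock `/health` endpoint and default port, which the model-runner integration may well change:

   ```bash
   # Assumes llama.cpp's default port 8080 and /health endpoint survive our patches
   curl -fsS http://localhost:8080/health
   ```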

5. **Commit the submodule update:**

   ```bash
   git add vendor/llama.cpp
   git commit -m "Bump llama.cpp to <version>"
   ```
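
   Before committing, it can be worth confirming that only the submodule pointer moved; the `--submodule` diff format prints the old and new commits the gitlink records:

   ```bash
   # Show the staged submodule pointer change as an old..new commit range
   git diff --cached --submodule vendor/llama.cpp
   ```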
