Commit 2b74d67

[NOID] Fixes #4124: Better document output of vector db procedures (#4154) (#4274)
1 parent 92928f4 commit 2b74d67

4 files changed

+338
-0
lines changed

docs/asciidoc/modules/ROOT/pages/database-integration/vectordb/chroma.adoc

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -44,13 +44,20 @@ With hostOrKey=null, the default is 'http://localhost:8000'.
4444
CALL apoc.vectordb.chroma.createCollection($host, 'test_collection', 'Cosine', 4, {<optional config>})
4545
----
4646

47+
.Example results
48+
[opts="header"]
49+
|===
50+
| name | metadata | database | id | tenant
51+
| test_collection | {"size": 4, "hnsw:space": "cosine"} | default_database | 9c046861-f46f-417d-bd01-ca8c9f99aee5 | default_tenant
52+
|===
4753

4854
.Delete a collection (it leverages https://docs.trychroma.com/usage-guide#creating-inspecting-and-deleting-collections[this API])
4955
[source,cypher]
5056
----
5157
CALL apoc.vectordb.chroma.deleteCollection($host, '<collection_id>', {<optional config>})
5258
----
5359

60+
which returns an empty result.
5461

5562
.Upsert vectors (it leverages https://docs.trychroma.com/usage-guide#adding-data-to-a-collection[this API])
5663
[source,cypher]
@@ -63,6 +70,7 @@ CALL apoc.vectordb.qdrant.upsert($host, '<collection_id>',
6370
{<optional config>})
6471
----
6572

73+
which returns an empty result.
6674

6775
.Get vectors (it leverages https://docs.trychroma.com/usage-guide#querying-a-collection[this API])
6876
[source,cypher]
@@ -149,9 +157,12 @@ CALL apoc.vectordb.chroma.query($host, '<collection_id>',
149157

150158

151159

160+
which returns a string that answers the `$question` by leveraging the embeddings stored in the vector database.
161+
152162
.Delete vectors (it leverages https://docs.trychroma.com/usage-guide#deleting-data-from-a-collection[this API])
153163
[source,cypher]
154164
----
155165
CALL apoc.vectordb.chroma.delete($host, '<collection_id>', [1,2], {<optional config>})
156166
----
157167

168+
which returns an array of strings with the deleted ids, for example `["1", "2"]`.
Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
= Milvus

Here is a list of all available Milvus procedures:

[opts=header, cols="1, 3"]
|===
| name | description
| apoc.vectordb.milvus.createCollection(hostOrKey, collection, similarity, size, $config) |
Creates a collection, with the name specified in the 2nd parameter, and with the specified `similarity` and `size`.
The default endpoint is `<hostOrKey param>/v2/vectordb/collections/create`.
| apoc.vectordb.milvus.deleteCollection(hostOrKey, collection, $config) |
Deletes a collection with the name specified in the 2nd parameter.
The default endpoint is `<hostOrKey param>/v2/vectordb/collections/drop`.
| apoc.vectordb.milvus.upsert(hostOrKey, collection, vectors, $config) |
Upserts, in the collection with the name specified in the 2nd parameter, the vectors [{id: 'id', vector: '<vectorDb>', metadata: '<metadata>'}].
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/upsert`.
| apoc.vectordb.milvus.delete(hostOrKey, collection, ids, $config) |
Deletes the vectors with the specified `ids`.
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/delete`.
| apoc.vectordb.milvus.get(hostOrKey, collection, ids, $config) |
Gets the vectors with the specified `ids`.
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/get`.
| apoc.vectordb.milvus.query(hostOrKey, collection, vector, filter, limit, $config) |
Retrieves the closest vectors to the defined `vector`, up to `limit` results, in the collection with the name specified in the 2nd parameter.
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/search`.
| apoc.vectordb.milvus.getAndUpdate(hostOrKey, collection, ids, $config) |
Gets the vectors with the specified `ids`, and optionally creates/updates Neo4j entities.
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/get`.
| apoc.vectordb.milvus.queryAndUpdate(hostOrKey, collection, vector, filter, limit, $config) |
Retrieves the closest vectors to the defined `vector`, up to `limit` results, in the collection with the name specified in the 2nd parameter, and optionally creates/updates Neo4j entities.
The default endpoint is `<hostOrKey param>/v2/vectordb/entities/search`.
|===

where the 1st parameter can be a key defined by the APOC config `apoc.milvus.<key>.host=myHost`.
With hostOrKey=null, the default host is 'http://localhost:19530'.
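
As a minimal sketch of these two options, assume a hypothetical key `myKey` (not defined anywhere on this page) has been set in the APOC configuration as `apoc.milvus.myKey.host=http://localhost:19530`. The key can then be passed in place of the URL, while `null` falls back to the default host:

[source,cypher]
----
// 'myKey' is a hypothetical key, assumed to be configured as apoc.milvus.myKey.host
CALL apoc.vectordb.milvus.get('myKey', 'test_collection', [1, 2], {<optional config>});

// null as the 1st parameter: the default host 'http://localhost:19530' is used
CALL apoc.vectordb.milvus.get(null, 'test_collection', [1, 2], {<optional config>});
----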

== Examples

Here is a list of examples using a local installation listening on port `19531`.

.Create a collection (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Collection%20(v2)/Create.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.createCollection('http://localhost:19531', 'test_collection', 'COSINE', 4, {<optional config>})
----

.Example results
[opts="header"]
|===
| data | code
| null | 200
|===

.Delete a collection (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Collection%20(v2)/Drop.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.deleteCollection('http://localhost:19531', 'test_collection', {<optional config>})
----

.Example results
[opts="header"]
|===
| data | code
| null | 200
|===

.Upsert vectors (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Upsert.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.upsert('http://localhost:19531', 'test_collection',
    [
        {id: 1, vector: [0.05, 0.61, 0.76, 0.74], metadata: {city: "Berlin", foo: "one"}},
        {id: 2, vector: [0.19, 0.81, 0.75, 0.11], metadata: {city: "London", foo: "two"}}
    ],
    {<optional config>})
----

.Example results
[opts="header"]
|===
| data | code
| {"upsertCount": 2, "upsertId": [1, 2]} | 200
|===

.Get vectors (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Get.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.get('http://localhost:19531', 'test_collection', [1,2], {<optional config>})
----

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | null | null | null | null
| null | {city: "London", foo: "two"} | null | null | null | null
| ...
|===

.Get vectors with `{allResults: true}`
[source,cypher]
----
CALL apoc.vectordb.milvus.get('http://localhost:19531', 'test_collection', [1,2], {allResults: true, <optional config>})
----

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| null | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| null | {city: "London", foo: "two"} | 2 | [...] | null | null
| ...
|===

.Query vectors (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Query.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.query('http://localhost:19531',
    'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    { must:
        [ { key: "city", match: { value: "London" } } ]
    },
    5,
    {allResults: true, <optional config>})
----

.Example results
[opts="header"]
|===
| score | metadata | id | vector | text | entity
| 1 | {city: "Berlin", foo: "one"} | 1 | [...] | null | null
| 0.1 | {city: "London", foo: "two"} | 2 | [...] | null | null
| ...
|===

We can define a mapping to auto-create one or more nodes and relationships by leveraging the vector metadata.

For example, if we have created 2 vectors with the above upsert procedure,
we can populate some existing nodes (i.e. `(:Test {myId: 'one'})` and `(:Test {myId: 'two'})`):

[source,cypher]
----
CALL apoc.vectordb.milvus.queryAndUpdate('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            embeddingKey: "vect",
            nodeLabel: "Test",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----

which populates the two nodes as: `(:Test {myId: 'one', city: 'Berlin', vect: [vector1]})` and `(:Test {myId: 'two', city: 'London', vect: [vector2]})`,
which will be returned in the `entity` result column.

We can also set the mapping configuration `mode` to `CREATE_IF_MISSING` (which creates nodes if they do not exist), `READ_ONLY` (to search for nodes/rels without making updates) or `UPDATE_EXISTING` (default behavior):

[source,cypher]
----
CALL apoc.vectordb.milvus.queryAndUpdate('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            mode: "CREATE_IF_MISSING",
            embeddingKey: "vect",
            nodeLabel: "Test",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----

which creates 2 new nodes as above.
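
For comparison, here is a sketch of the `READ_ONLY` mode listed above, reusing the same mapping keys as the previous examples. It assumes the `Test` nodes already exist; with this mode the call is expected only to look up the matching nodes, without creating or updating them:

[source,cypher]
----
CALL apoc.vectordb.milvus.queryAndUpdate('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            // only search for matching nodes, no writes
            mode: "READ_ONLY",
            embeddingKey: "vect",
            nodeLabel: "Test",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----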

Or, we can populate existing relationships (i.e. `(:Start)-[:TEST {myId: 'one'}]->(:End)` and `(:Start)-[:TEST {myId: 'two'}]->(:End)`):

[source,cypher]
----
CALL apoc.vectordb.milvus.queryAndUpdate('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            embeddingKey: "vect",
            relType: "TEST",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----

which populates the two relationships as: `()-[:TEST {myId: 'one', city: 'Berlin', vect: [vector1]}]-()`
and `()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-()`,
which will be returned in the `entity` result column.

We can also use mapping with the `apoc.vectordb.milvus.query` procedure, to search for nodes/rels matching the label/type and metadataKey, without making updates
(i.e. equivalent to the `*.queryAndUpdate` procedure with a mapping config having `mode: "READ_ONLY"`).

For example, with the previous relationships, we can execute the following procedure, which just returns the relationships in the column `rel`:

[source,cypher]
----
CALL apoc.vectordb.milvus.query('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            embeddingKey: "vect",
            relType: "TEST",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----

[NOTE]
====
We can use mapping with the `apoc.vectordb.milvus.get*` procedures as well.
====
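
A sketch of what this could look like, assuming `apoc.vectordb.milvus.get` accepts the same `mapping` configuration shown for `query` (the keys below are copied from the earlier examples and are not confirmed separately for `get`):

[source,cypher]
----
CALL apoc.vectordb.milvus.get('http://localhost:19531', 'test_collection', [1, 2],
    { mapping: {
            embeddingKey: "vect",
            nodeLabel: "Test",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })
----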

[NOTE]
====
To optimize performance, we can choose what to `YIELD` with the `apoc.vectordb.milvus.query*` and the `apoc.vectordb.milvus.get*` procedures.

For example, by executing a `CALL apoc.vectordb.milvus.query(...) YIELD metadata, score, id`, the REST API request will include `{"with_payload": false, "with_vectors": false}`,
so that we do not return the other values that we do not need.
====
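
Written out in full, the call from the note above might look like the following sketch. It reuses the host, collection and query vector from the earlier examples, and the `RETURN` clause is only illustrative:

[source,cypher]
----
CALL apoc.vectordb.milvus.query('http://localhost:19531', 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    {<optional config>})
// yield only the columns we need; the vector column is not requested
YIELD metadata, score, id
RETURN id, score, metadata
----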

It is possible to execute vector db procedures together with the xref::ml/rag.adoc[apoc.ml.rag] as follows:

[source,cypher]
----
CALL apoc.vectordb.milvus.getAndUpdate($host, $collection, [<id1>, <id2>], $conf) YIELD node, metadata, id, vector
WITH collect(node) as paths
CALL apoc.ml.rag(paths, $attributes, $question, $confPrompt) YIELD value
RETURN value
----

which returns a string that answers the `$question` by leveraging the embeddings stored in the vector database.

.Delete vectors (it leverages https://milvus.io/api-reference/restful/v2.4.x/v2/Vector%20(v2)/Delete.md[this API])
[source,cypher]
----
CALL apoc.vectordb.milvus.delete('http://localhost:19531', 'test_collection', [1,2], {<optional config>})
----

.Example results
[opts="header"]
|===
| data | code
| null | 200
|===

docs/asciidoc/modules/ROOT/pages/database-integration/vectordb/qdrant.adoc

Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,13 +45,25 @@ With hostOrKey=null, the default is 'http://localhost:6333'.
4545
CALL apoc.vectordb.qdrant.createCollection($hostOrKey, 'test_collection', 'Cosine', 4, {<optional config>})
4646
----
4747

48+
.Example results
49+
[opts="header"]
50+
|===
51+
| result | time | status
52+
| true | 0.094182458 | "ok"
53+
|===
4854

4955
.Delete a collection (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/collections/operation/delete_collection[this API])
5056
[source,cypher]
5157
----
5258
CALL apoc.vectordb.qdrant.deleteCollection($hostOrKey, 'test_collection', {<optional config>})
5359
----
5460

61+
.Example results
62+
[opts="header"]
63+
|===
64+
| result | time | status
65+
| true | 0.094182458 | "ok"
66+
|===
5567

5668
.Upsert vectors (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/upsert_points[this API])
5769
[source,cypher]
@@ -64,6 +76,12 @@ CALL apoc.vectordb.qdrant.upsert($hostOrKey, 'test_collection',
6476
{<optional config>})
6577
----
6678

79+
.Example results
80+
[opts="header"]
81+
|===
82+
| result | time | status
83+
| {"result": { "operation_id": 0, "status": "acknowledged" } } | 0.094182458 | "ok"
84+
|===
6785

6886
.Get vectors (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/get_points[this API])
6987
[source,cypher]
@@ -202,8 +220,17 @@ so that we do not return the other values that we do not need.
202220

203221

204222

223+
which returns a string that answers the `$question` by leveraging the embeddings stored in the vector database.
224+
205225
.Delete vectors (it leverages https://qdrant.github.io/qdrant/redoc/index.html#tag/points/operation/delete_vectors[this API])
206226
[source,cypher]
207227
----
208228
CALL apoc.vectordb.qdrant.delete($hostOrKey, 'test_collection', [1,2], {<optional config>})
209229
----
230+
231+
.Example results
232+
[opts="header"]
233+
|===
234+
| result | time | status
235+
| {"result": { "operation_id": 2, "status": "acknowledged" } } | 0.094182458 | "ok"
236+
|===
