Commit 9b7fb1e

add bloom_filter module
1 parent 95e266b commit 9b7fb1e

3 files changed

+613
-0
lines changed

Lines changed: 306 additions & 0 deletions
@@ -0,0 +1,306 @@
# core.base.bloom_filter

This module provides a Bloom filter implementation, a probabilistic data structure used to test whether an element is a member of a set.

::: tip TIP
To use this module, you need to import it first: `import("core.base.bloom_filter")`
:::

A Bloom filter is space-efficient because it records membership in a compact bit array instead of storing the items themselves. False positives are possible, but false negatives are not: a negative answer is definitive, while a positive answer means the item is probably in the set. This makes it useful for membership checks where memory is limited and an occasional false positive is acceptable.
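
To make the behaviour above concrete, here is a minimal pure-Lua sketch of the general technique. It is not the implementation behind `core.base.bloom_filter`; the names `toy_hash`, `toy_bloom_new`, `toy_bloom_set` and `toy_bloom_get` are hypothetical, and the hash function is deliberately simplistic.

```lua
-- Illustrative toy only; NOT the code used by core.base.bloom_filter.

-- Simple seeded polynomial string hash, reduced modulo `size`.
local function toy_hash(str, seed, size)
    local h = seed
    for i = 1, #str do
        h = (h * 31 + str:byte(i)) % size
    end
    return h + 1 -- Lua tables are 1-indexed
end

local function toy_bloom_new(size, hash_count)
    return {bits = {}, size = size, hash_count = hash_count}
end

-- Adding an item sets one bit position per hash function.
local function toy_bloom_set(filter, item)
    for seed = 1, filter.hash_count do
        filter.bits[toy_hash(item, seed, filter.size)] = true
    end
end

-- An item can only have been added if all of its bit positions are set.
local function toy_bloom_get(filter, item)
    for seed = 1, filter.hash_count do
        if not filter.bits[toy_hash(item, seed, filter.size)] then
            return false -- definitely never added
        end
    end
    return true -- probably added (could be a false positive)
end

local toy = toy_bloom_new(1024, 3)
toy_bloom_set(toy, "hello")
toy_bloom_set(toy, "xmake")
print(toy_bloom_get(toy, "hello")) --> true
print(toy_bloom_get(toy, "xmake")) --> true
```

Because adding an item only ever sets bits, a lookup for an added item can never fail. Other items may happen to map onto bits that are already set, which is where false positives come from.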

## bloom_filter.new

- Create a new Bloom filter

#### Function Prototype

::: tip API
```lua
bloom_filter.new(opt?: <table>)
```
:::

#### Parameter Description

| Parameter | Description |
|-----------|-------------|
| opt | Optional. Configuration options:<br>- `probability` - False positive probability (default: 0.001)<br>- `hash_count` - Number of hash functions (default: 3)<br>- `item_maxn` - Maximum number of items (default: 1000000) |

#### Return Value

| Type | Description |
|------|-------------|
| bloom_filter | Returns a bloom filter instance |
| nil, string | Returns nil and error message on failure |

#### Usage

```lua
import("core.base.bloom_filter")

-- Create a new bloom filter with default settings
local filter = bloom_filter.new()

-- Create with custom settings (these values match the defaults)
local custom_filter = bloom_filter.new({
    probability = 0.001,
    hash_count = 3,
    item_maxn = 1000000
})
```

::: tip TIP
Supported probability values: 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001
:::
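
As a rough guide to how the three options interact, the sketch below applies the standard textbook approximation for Bloom filters, p ≈ (1 - e^(-k·n/m))^k, where n is the number of items, k the number of hash functions and m the number of bits. This is an assumption for illustration only: the source does not describe how the module sizes its bit array, and `estimate_bits` and `estimate_probability` are hypothetical helper names.

```lua
-- Textbook estimate, not taken from the module's source:
--   p ≈ (1 - e^(-k*n/m))^k   =>   m ≈ -k*n / ln(1 - p^(1/k))
local function estimate_bits(item_maxn, hash_count, probability)
    local per_hash = probability ^ (1 / hash_count)
    return math.ceil(-hash_count * item_maxn / math.log(1 - per_hash))
end

-- Inverse direction: expected false positive rate for a given bit count.
local function estimate_probability(item_maxn, hash_count, bits)
    return (1 - math.exp(-hash_count * item_maxn / bits)) ^ hash_count
end

-- Plug in the documented defaults.
local bits = estimate_bits(1000000, 3, 0.001)
print(string.format("estimated size: %.2f MiB", bits / 8 / 1024 / 1024))
print("estimated false positive rate:", estimate_probability(1000000, 3, bits))
```

Under this approximation the default settings need on the order of a few megabytes. Inserting more than `item_maxn` items would push the false positive rate above the configured `probability`.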

## filter:set

- Add an item to the Bloom filter

#### Function Prototype

::: tip API
```lua
filter:set(item: <string>)
```
:::

#### Parameter Description

| Parameter | Description |
|-----------|-------------|
| item | Required. String item to add to the filter |

#### Return Value

| Type | Description |
|------|-------------|
| boolean | Returns true if the item was added, false if the filter reports it as already present (which may be a false positive) |

#### Usage

```lua
import("core.base.bloom_filter")

local filter = bloom_filter.new()

-- Add items to the filter
if filter:set("hello") then
    print("Item added successfully")
end

-- Setting the same item again returns false.
-- Note: false positives are possible, so false can also be returned
-- for an item that was never added.
if filter:set("hello") then
    print("This won't print - item already exists")
else
    print("Item already exists (may be false positive)")
end
```
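
One common way to use this return value is as a "first time seen" check. The sketch below assumes only the `set(item)` contract documented above; `first_seen` and `exact_filter_new` are hypothetical names, and the exact-set stand-in exists only so the example runs outside xmake.

```lua
-- Keep only the items for which filter:set() returns true (first sighting).
-- `filter` is any object whose set(item) method behaves as documented above.
local function first_seen(items, filter)
    local result = {}
    for _, item in ipairs(items) do
        if filter:set(item) then
            table.insert(result, item)
        end
    end
    return result
end

-- Hypothetical stand-in with the same set() contract, backed by an exact set.
-- Inside xmake, pass the object returned by bloom_filter.new() instead.
local function exact_filter_new()
    local seen = {}
    return {
        set = function(self, item)
            if seen[item] then
                return false
            end
            seen[item] = true
            return true
        end
    }
end

local unique = first_seen({"a.c", "b.c", "a.c", "c.c", "b.c"}, exact_filter_new())
print(table.concat(unique, ", ")) --> a.c, b.c, c.c
```

With a real Bloom filter, a false positive would cause an unseen item to be skipped, so this pattern fits cases where occasionally dropping an item is acceptable.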

## filter:get

- Check if an item exists in the Bloom filter

#### Function Prototype

::: tip API
```lua
filter:get(item: <string>)
```
:::

#### Parameter Description

| Parameter | Description |
|-----------|-------------|
| item | Required. String item to check |

#### Return Value

| Type | Description |
|------|-------------|
| boolean | Returns true if the item is probably in the filter, false if it is definitely not |

#### Usage

```lua
import("core.base.bloom_filter")

local filter = bloom_filter.new()

-- Add some items
filter:set("hello")
filter:set("xmake")

-- Check for items
if filter:get("hello") then
    print("hello exists")
end

if filter:get("not exists") then
    -- Only reached on a false positive, which is very unlikely here
    print("This won't print")
else
    print("Item not found")
end
```

::: warning NOTE
Bloom filters can produce false positives (claiming an item exists when it doesn't), but never false negatives (claiming an item doesn't exist when it does).
:::

## filter:data

- Get the Bloom filter data as a bytes object

#### Function Prototype

::: tip API
```lua
filter:data()
```
:::

#### Parameter Description

No parameters

#### Return Value

| Type | Description |
|------|-------------|
| bytes | Returns the filter data as a bytes object |
| nil, string | Returns nil and error message on failure |

#### Usage

```lua
import("core.base.bloom_filter")

local filter = bloom_filter.new()
filter:set("hello")
filter:set("xmake")

-- Get the filter data
local data = filter:data()

-- Save or transfer the data
print("Filter size:", data:size())
```

## filter:data_set

- Set the Bloom filter data from a bytes object

#### Function Prototype

::: tip API
```lua
filter:data_set(data: <bytes>)
```
:::

#### Parameter Description

| Parameter | Description |
|-----------|-------------|
| data | Required. Bytes object containing the filter data |

#### Return Value

| Type | Description |
|------|-------------|
| boolean | Returns true if data was set successfully |

#### Usage

```lua
import("core.base.bloom_filter")

-- Create first filter and add items
local filter1 = bloom_filter.new()
filter1:set("hello")
filter1:set("xmake")

-- Get data from first filter
local data = filter1:data()

-- Create second filter (same default settings) and load data
local filter2 = bloom_filter.new()
filter2:data_set(data)

-- Both filters now contain the same items
assert(filter2:get("hello") == true)
assert(filter2:get("xmake") == true)
```

## filter:clear

- Clear all items from the Bloom filter

#### Function Prototype

::: tip API
```lua
filter:clear()
```
:::

#### Parameter Description

No parameters

#### Return Value

| Type | Description |
|------|-------------|
| boolean | Returns true on success, false on failure |

#### Usage

```lua
import("core.base.bloom_filter")

local filter = bloom_filter.new()
filter:set("hello")
filter:set("xmake")

-- Clear all items
filter:clear()

-- Items are now removed
assert(filter:get("hello") == false)
assert(filter:get("xmake") == false)
```

## filter:cdata

- Get the internal C data handle

#### Function Prototype

::: tip API
```lua
filter:cdata()
```
:::

#### Parameter Description

No parameters

#### Return Value

| Type | Description |
|------|-------------|
| userdata | Returns the internal C data handle |

#### Usage

```lua
import("core.base.bloom_filter")

local filter = bloom_filter.new()
local cdata = filter:cdata()
print("C data handle:", cdata)
```

docs/sidebar.ts

Lines changed: 1 addition & 0 deletions
@@ -63,6 +63,7 @@ function coreBaseModulesApiSidebar(): DefaultTheme.SidebarItem {
     collapsed: true,
     items: [
       { text: 'bit', link: 'extension-modules/core/base/bit' },
+      { text: 'bloom_filter', link: 'extension-modules/core/base/bloom_filter' },
       { text: 'bytes', link: 'extension-modules/core/base/bytes' },
       { text: 'cpu', link: 'extension-modules/core/base/cpu' },
       { text: 'global', link: 'extension-modules/core/base/global' },
