Commit 2141c89

Merge branch 'master' into resume-reprovide-cycle
2 parents fc32e72 + 16479ec commit 2141c89

11 files changed, +1462 -242 lines changed

core/commands/provide.go

Lines changed: 425 additions & 37 deletions
Large diffs are not rendered by default.

docs/changelogs/v0.39.md

Lines changed: 42 additions & 0 deletions
@@ -10,6 +10,7 @@ This release was brought to you by the [Shipyard](https://ipshipyard.com/) team.
 
 - [Overview](#overview)
 - [🔦 Highlights](#-highlights)
+- [📊 Detailed statistics for Sweep provider with `ipfs provide stat`](#-detailed-statistics-for-sweep-provider-with-ipfs-provide-stat)
 - [Provider resume cycle for improved reproviding reliability](#provider-resume-cycle-for-improved-reproviding-reliability)
 - [🪦 Deprecated `go-ipfs` name no longer published](#-deprecated-go-ipfs-name-no-longer-published)
 - [📦️ Important dependency updates](#-important-dependency-updates)
@@ -20,6 +21,47 @@ This release was brought to you by the [Shipyard](https://ipshipyard.com/) team.
 
 ### 🔦 Highlights
 
+#### 📊 Detailed statistics for Sweep provider with `ipfs provide stat`
+
+The experimental Sweep provider system ([introduced in
+v0.38](https://github.com/ipfs/kubo/blob/master/docs/changelogs/v0.38.md#-experimental-sweeping-dht-provider))
+now has detailed statistics available through `ipfs provide stat`.
+
+These statistics help you monitor provider health and troubleshoot issues,
+which is especially useful for nodes providing large content collections. You can
+quickly identify bottlenecks like queue backlog, worker saturation, or connectivity
+problems that might prevent content from being announced to the DHT.
+
+**Default behavior:** Displays a brief summary showing queue sizes, scheduled
+CIDs/regions, average record holders, ongoing/total provides, and worker status
+when resources are constrained.
+
+**Detailed statistics with `--all`:** View complete metrics organized into sections:
+
+- **Connectivity**: DHT connection status
+- **Queues**: Pending provide and reprovide operations
+- **Schedule**: CIDs/regions scheduled for reprovide
+- **Timings**: Uptime, reprovide cycle information
+- **Network**: Peer statistics, keyspace region sizes
+- **Operations**: Ongoing and past provides, rates, errors
+- **Workers**: Worker pool utilization and availability
+
+**Real-time monitoring:** For continuous monitoring, run
+`watch ipfs provide stat --all --compact` to see detailed statistics refreshed
+in a 2-column layout. This lets you observe provide rates, queue sizes, and
+worker availability in real time. Individual sections can be displayed using
+flags like `--network`, `--operations`, or `--workers`, and multiple flags can
+be combined for custom views.
+
+**Dual DHT support:** For Dual DHT configurations, use `--lan` to view LAN DHT
+provider statistics instead of the default WAN DHT stats.
+
+> [!NOTE]
+> These statistics are only available when using the Sweep provider system
+> (enabled via
+> [`Provide.DHT.SweepEnabled`](https://github.com/ipfs/kubo/blob/master/docs/config.md#providedhtsweepenabled)).
+> The legacy provider shows basic statistics without flag support.
+
 #### Provider resume cycle for improved reproviding reliability
 
 When using the sweeping provider (`Provide.DHT.SweepEnabled`), Kubo now

docs/config.md

Lines changed: 102 additions & 43 deletions
@@ -1911,10 +1911,17 @@ Type: `duration`
 
 ## `Provide`
 
-Configures CID announcements to the routing system, including both immediate
-announcements for new content (provide) and periodic re-announcements
-(reprovide) on systems that require it, like Amino DHT. While designed to support
-multiple routing systems in the future, the current default configuration only supports providing to the Amino DHT.
+Configures how your node advertises content to make it discoverable by other
+peers.
+
+**What is providing?** When your node stores content, it publishes provider
+records to the routing system announcing "I have this content". These records
+map CIDs to your peer ID, enabling content discovery across the network.
+
+While designed to support multiple routing systems in the future, the current
+default configuration only supports [providing to the Amino DHT](#providedht).
+
+<!-- TODO: See the [Reprovide Sweep blog post](https://blog.ipfs.tech/2025-reprovide-sweep/) for detailed performance comparisons. -->
 
 ### `Provide.Enabled`
 
@@ -1965,13 +1972,39 @@ Type: `optionalString` (unset for the default)
 
 Configuration for providing data to Amino DHT peers.
 
+**Provider record lifecycle:** On the Amino DHT, provider records expire after
+[`amino.DefaultProvideValidity`](https://github.com/libp2p/go-libp2p-kad-dht/blob/v0.34.0/amino/defaults.go#L40-L43).
+Your node must re-announce (reprovide) content periodically to keep it
+discoverable. The [`Provide.DHT.Interval`](#providedhtinterval) setting
+controls this timing, with the default ensuring records refresh well before
+expiration or negative churn effects kick in.
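
The relationship between the reprovide interval and record expiry described in the added paragraph can be checked with a little arithmetic. The Go sketch below is illustrative only: the 48h validity is an assumed value for `amino.DefaultProvideValidity` (this diff links to the constant but does not state it), the 22h interval is the figure this diff quotes for `Provide.DHT.Interval` in the `Provide.DHT.SweepEnabled` section, and `refreshMargin` is a hypothetical helper, not part of Kubo.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	// ASSUMPTION: believed value of amino.DefaultProvideValidity; not stated in this diff.
	assumedProvideValidity = 48 * time.Hour
	// The 22h figure this diff quotes for Provide.DHT.Interval.
	quotedInterval = 22 * time.Hour
)

// refreshMargin is a hypothetical helper. It returns how much validity a
// provider record still has at the moment the next reprovide is due.
func refreshMargin(interval, validity time.Duration) (time.Duration, error) {
	if interval <= 0 {
		return 0, errors.New("interval must be positive (\"0\" disables reproviding)")
	}
	if interval >= validity {
		return 0, errors.New("interval is not shorter than record validity; records would expire")
	}
	return validity - interval, nil
}

func main() {
	margin, err := refreshMargin(quotedInterval, assumedProvideValidity)
	if err != nil {
		fmt.Println("unsafe interval:", err)
		return
	}
	fmt.Printf("records are refreshed with %v of validity left\n", margin)
}
```

Under these assumed values the interval is a little under half of the validity window, which matches the "approximately half" wording used for `Provide.DHT.Interval` later in this diff.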
+
+**Two provider systems:**
+
+- **Sweep provider**: Divides the DHT keyspace into regions and systematically
+sweeps through them over the reprovide interval. This batches CIDs allocated
+to the same DHT servers, dramatically reducing the number of DHT lookups and
+PUTs needed. Spreads work evenly over time with predictable resource usage.
+
+- **Legacy provider**: Processes each CID individually with separate DHT
+lookups. Works well for small content collections but struggles to complete
+reprovide cycles when managing thousands of CIDs.
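
To make the difference between the two provider systems concrete, here is a small Go sketch of the lookup arithmetic. It borrows the example figures this diff gives in the `Provide.DHT.SweepEnabled` section (100,000 CIDs, roughly 3,000 lookups for about 10,000 DHT servers, at least 20 peers per region, a 22h interval). The region count of 500 is only an illustrative upper bound derived from those figures (10,000 / 20), and both helper functions are hypothetical, not Kubo APIs.

```go
package main

import (
	"fmt"
	"time"
)

// Example figures quoted in this diff; real values depend on the network.
const (
	exampleCIDs         = 100_000
	exampleSweepLookups = 3_000
	exampleInterval     = 22 * time.Hour
	// ASSUMPTION: illustrative upper bound, ~10,000 servers / 20 peers per region.
	exampleRegions = 500
)

// lookupReductionPercent compares one lookup per CID (legacy) with a fixed
// number of batched lookups (sweep) and returns the percentage saved.
func lookupReductionPercent(cids, sweepLookups int) float64 {
	if cids <= 0 || sweepLookups >= cids {
		return 0
	}
	return 100 * float64(cids-sweepLookups) / float64(cids)
}

// regionSpacing returns the gap between consecutive region reprovides when
// regions are spread evenly over one reprovide interval.
func regionSpacing(interval time.Duration, regions int) time.Duration {
	if regions <= 0 {
		return 0
	}
	return interval / time.Duration(regions)
}

func main() {
	fmt.Printf("lookups saved: %.0f%%\n", lookupReductionPercent(exampleCIDs, exampleSweepLookups))
	fmt.Printf("one region every %v\n", regionSpacing(exampleInterval, exampleRegions))
}
```

The point of the sketch is that the sweep cost depends on the size of the DHT rather than on the number of CIDs, so the saving grows as the collection grows, while the legacy cost grows linearly with it.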
+
 #### Monitoring Provide Operations
 
-You can monitor the effectiveness of your provide configuration through metrics exposed at the Prometheus endpoint: `{Addresses.API}/debug/metrics/prometheus` (default: `http://127.0.0.1:5001/debug/metrics/prometheus`).
+**Quick command-line monitoring:** Use `ipfs provide stat` to view the current
+state of the provider system. For real-time monitoring, run
+`watch ipfs provide stat --all --compact` to see detailed statistics refreshed
+continuously in a 2-column layout.
 
-Different metrics are available depending on whether you use legacy mode (`SweepEnabled=false`) or sweep mode (`SweepEnabled=true`). See [Provide metrics documentation](https://github.com/ipfs/kubo/blob/master/docs/metrics.md#provide) for details.
+**Long-term monitoring:** For in-depth or long-term monitoring, metrics are
+exposed at the Prometheus endpoint: `{Addresses.API}/debug/metrics/prometheus`
+(default: `http://127.0.0.1:5001/debug/metrics/prometheus`). Different metrics
+are available depending on whether you use legacy mode (`SweepEnabled=false`) or
+sweep mode (`SweepEnabled=true`). See [Provide metrics documentation](https://github.com/ipfs/kubo/blob/master/docs/metrics.md#provide)
+for details.
 
-To enable detailed debug logging for both providers, set:
+**Debug logging:** For troubleshooting, enable detailed logging by setting:
 
 ```sh
 GOLOG_LOG_LEVEL=error,provider=debug,dht/provider=debug
@@ -1983,12 +2016,24 @@ GOLOG_LOG_LEVEL=error,provider=debug,dht/provider=debug
 #### `Provide.DHT.Interval`
 
 Sets how often to re-announce content to the DHT. Provider records on Amino DHT
-expire after [`amino.DefaultProvideValidity`](https://github.com/libp2p/go-libp2p-kad-dht/blob/v0.34.0/amino/defaults.go#L40-L43),
-also known as Provider Record Expiration Interval.
+expire after [`amino.DefaultProvideValidity`](https://github.com/libp2p/go-libp2p-kad-dht/blob/v0.34.0/amino/defaults.go#L40-L43).
+
+**Why this matters:** The interval must be shorter than the expiration window to
+ensure provider records refresh before they expire. The default value is
+approximately half of [`amino.DefaultProvideValidity`](https://github.com/libp2p/go-libp2p-kad-dht/blob/v0.34.0/amino/defaults.go#L40-L43),
+which accounts for network churn and ensures records stay alive without
+overwhelming the network with unnecessary announcements.
 
-An interval of about half the expiration window ensures provider records
-are refreshed well before they expire. This keeps your content continuously
-discoverable accounting for network churn without overwhelming the network with too frequent announcements.
+**With sweep mode enabled
+([`Provide.DHT.SweepEnabled`](#providedhtsweepenabled)):** The system spreads
+reprovide operations smoothly across this entire interval. Each keyspace region
+is reprovided at its scheduled time within the period, ensuring each
+announcement is repeated once per interval.
+
+**With legacy mode:** The system attempts to reprovide all CIDs as quickly as
+possible at the start of each interval. If reproviding takes longer than this
+interval (common with large datasets), the next cycle is skipped and provider
+records may expire.
 
 - If unset, it uses the implicit safe default.
 - If set to the value `"0"` it will disable content reproviding to DHT.
@@ -2056,32 +2101,44 @@ Type: `optionalInteger` (non-negative; `0` means unlimited number of workers)
 
 #### `Provide.DHT.SweepEnabled`
 
-Whether Provide Sweep is enabled. If not enabled, the legacy
-[`boxo/provider`](https://github.com/ipfs/boxo/tree/main/provider) is used for
-both provides and reprovides.
-
-Provide Sweep is a resource efficient technique for advertising content to
-the Amino DHT swarm. The Provide Sweep module tracks the keys that should be periodically reprovided in
-the `Keystore`. It splits the keys into DHT keyspace regions by proximity (XOR
-distance), and schedules when reprovides should happen in order to spread the
-reprovide operation over time to avoid a spike in resource utilization. It
-basically sweeps the keyspace _from left to right_ over the
-[`Provide.DHT.Interval`](#providedhtinterval) time period, and reprovides keys
-matching to the visited keyspace region.
-
-Provide Sweep aims at replacing the inefficient legacy `boxo/provider`
-module, and is currently opt-in. You can compare the effectiveness of sweep mode vs legacy mode by monitoring the appropriate metrics (see [Monitoring Provide Operations](#monitoring-provide-operations) above).
-
-Whenever new keys should be advertised to the Amino DHT, `kubo` calls
-`StartProviding()`, triggering an initial `provide` operation for the given
-keys. The keys will be added to the `Keystore` tracking which keys should be
-reprovided and when they should be reprovided. Calling `StopProviding()`
-removes the keys from the `Keystore`. However, it is currently tricky for
-`kubo` to detect when a key should stop being advertised. Hence, `kubo` will
-periodically refresh the `Keystore` at each [`Provide.DHT.Interval`](#providedhtinterval)
-by providing it a channel of all the keys it is expected to contain according
-to the [`Provide.Strategy`](#providestrategy). During this operation,
-all keys in the `Keystore` are purged, and only the given ones remain scheduled.
+Enables the sweep provider for efficient content announcements. When disabled,
+the legacy [`boxo/provider`](https://github.com/ipfs/boxo/tree/main/provider) is
+used instead.
+
+**The legacy provider problem:** The legacy system processes CIDs one at a
+time, requiring a separate DHT lookup (10-20 seconds each) to find the 20
+closest peers for each CID. This sequential approach typically handles fewer
+than 10,000 CIDs over 22h ([`Provide.DHT.Interval`](#providedhtinterval)). If
+your node has more CIDs than can be reprovided within
+[`Provide.DHT.Interval`](#providedhtinterval), provider records start expiring
+after
+[`amino.DefaultProvideValidity`](https://github.com/libp2p/go-libp2p-kad-dht/blob/v0.34.0/amino/defaults.go#L40-L43),
+making content undiscoverable.
+
+**How sweep mode works:** The sweep provider divides the DHT keyspace into
+regions based on keyspace prefixes. It estimates the Amino DHT size, calculates
+how many regions are needed (sized to contain at least 20 peers each), then
+schedules region processing evenly across
+[`Provide.DHT.Interval`](#providedhtinterval). When processing a region, it
+discovers the peers in that region once, then sends all provider records for
+CIDs allocated to those peers in a batch. This batching is the key efficiency:
+instead of N lookups for N CIDs, the number of lookups is bounded by a constant
+fraction of the Amino DHT size (e.g., ~3,000 lookups when there are ~10,000 DHT
+servers), regardless of how many CIDs you're providing.
+
+**Efficiency gains:** For a node providing 100,000 CIDs, sweep mode reduces
+lookups by 97% compared to legacy. The work spreads smoothly over time rather
+than completing in bursts, preventing resource spikes and duplicate
+announcements. Long-running nodes reprovide systematically just before records
+would expire, keeping content continuously discoverable without wasting
+bandwidth.
+
+**Implementation details:** The sweep provider tracks CIDs in a persistent
+keystore. New content added via `StartProviding()` enters the provide queue and
+gets batched by keyspace region. The keystore is periodically refreshed at each
+[`Provide.DHT.Interval`](#providedhtinterval) with CIDs matching
+[`Provide.Strategy`](#providestrategy) to ensure only current content remains
+scheduled. This handles cases where content is unpinned or removed.
 
 **Persistent reprovide cycle state:** When Provide Sweep is enabled, the
 reprovide cycle state is persisted to the datastore by default. On restart, Kubo
@@ -2100,13 +2157,15 @@ to `false`.
 > <img alt="Reprovide Cycle Comparison" src="https://github.com/user-attachments/assets/e1662d7c-f1be-4275-a9ed-f2752fcdcabe">
 > </picture>
 >
-> The diagram above visualizes the performance patterns:
+> The diagram compares performance patterns:
 >
-> - **Legacy mode**: Individual (slow) provides per CID, can struggle with large datasets
-> - **Sweep mode**: Even distribution matching the keyspace sweep described with low resource usage
-> - **Accelerated DHT**: Hourly traffic spikes with high resource usage
+> - **Legacy mode**: Sequential processing, one lookup per CID, struggles with large datasets
+> - **Sweep mode**: Smooth distribution over time, batched lookups by keyspace region, predictable resource usage
+> - **Accelerated DHT**: Hourly network crawls creating traffic spikes, high resource usage
 >
-> Sweep mode provides similar effectiveness to Accelerated DHT but with steady resource usage - better for machines with limited CPU, memory, or network bandwidth.
+> Sweep mode achieves similar effectiveness to the Accelerated DHT client but with steady resource consumption.
+
+You can compare the effectiveness of sweep mode vs legacy mode by monitoring the appropriate metrics (see [Monitoring Provide Operations](#monitoring-provide-operations) above).
 
 > [!NOTE]
 > This feature is opt-in for now, but will become the default in a future release.

docs/examples/kubo-as-a-library/go.mod

Lines changed: 16 additions & 16 deletions
@@ -7,9 +7,9 @@ go 1.25
 replace github.com/ipfs/kubo => ./../../..
 
 require (
-github.com/ipfs/boxo v0.35.1-0.20251016232905-37006871a40e
+github.com/ipfs/boxo v0.35.0
 github.com/ipfs/kubo v0.0.0-00010101000000-000000000000
-github.com/libp2p/go-libp2p v0.44.0
+github.com/libp2p/go-libp2p v0.43.0
 github.com/multiformats/go-multiaddr v0.16.1
 )
 
@@ -82,7 +82,7 @@ require (
 github.com/ipfs/go-ds-flatfs v0.5.5 // indirect
 github.com/ipfs/go-ds-leveldb v0.5.2 // indirect
 github.com/ipfs/go-ds-measure v0.2.2 // indirect
-github.com/ipfs/go-ds-pebble v0.5.5 // indirect
+github.com/ipfs/go-ds-pebble v0.5.3 // indirect
 github.com/ipfs/go-dsqueue v0.1.0 // indirect
 github.com/ipfs/go-fs-lock v0.1.1 // indirect
 github.com/ipfs/go-ipfs-cmds v0.15.0 // indirect
@@ -98,7 +98,7 @@ require (
 github.com/ipfs/go-peertaskqueue v0.8.2 // indirect
 github.com/ipfs/go-test v0.2.3 // indirect
 github.com/ipfs/go-unixfsnode v1.10.2 // indirect
-github.com/ipld/go-car/v2 v2.16.0 // indirect
+github.com/ipld/go-car/v2 v2.15.0 // indirect
 github.com/ipld/go-codec-dagpb v1.7.0 // indirect
 github.com/ipld/go-ipld-prime v0.21.0 // indirect
 github.com/ipshipyard/p2p-forge v0.6.1 // indirect
@@ -123,7 +123,7 @@ require (
 github.com/libp2p/go-libp2p-routing-helpers v0.7.5 // indirect
 github.com/libp2p/go-libp2p-xor v0.1.0 // indirect
 github.com/libp2p/go-msgio v0.3.0 // indirect
-github.com/libp2p/go-netroute v0.3.0 // indirect
+github.com/libp2p/go-netroute v0.2.2 // indirect
 github.com/libp2p/go-reuseport v0.4.0 // indirect
 github.com/libp2p/go-yamux/v5 v5.0.1 // indirect
 github.com/libp2p/zeroconf/v2 v2.2.0 // indirect
@@ -141,7 +141,7 @@ require (
 github.com/multiformats/go-multiaddr-dns v0.4.1 // indirect
 github.com/multiformats/go-multiaddr-fmt v0.1.0 // indirect
 github.com/multiformats/go-multibase v0.2.0 // indirect
-github.com/multiformats/go-multicodec v0.10.0 // indirect
+github.com/multiformats/go-multicodec v0.9.2 // indirect
 github.com/multiformats/go-multihash v0.2.3 // indirect
 github.com/multiformats/go-multistream v0.6.1 // indirect
 github.com/multiformats/go-varint v0.1.0 // indirect
@@ -177,7 +177,7 @@ require (
 github.com/prometheus/common v0.66.1 // indirect
 github.com/prometheus/procfs v0.17.0 // indirect
 github.com/quic-go/qpack v0.5.1 // indirect
-github.com/quic-go/quic-go v0.55.0 // indirect
+github.com/quic-go/quic-go v0.54.1 // indirect
 github.com/quic-go/webtransport-go v0.9.0 // indirect
 github.com/rogpeppe/go-internal v1.14.1 // indirect
 github.com/spaolacci/murmur3 v1.1.0 // indirect
@@ -212,22 +212,22 @@ require (
 go.uber.org/zap/exp v0.3.0 // indirect
 go.yaml.in/yaml/v2 v2.4.3 // indirect
 go4.org v0.0.0-20230225012048-214862532bf5 // indirect
-golang.org/x/crypto v0.43.0 // indirect
-golang.org/x/exp v0.0.0-20251009144603-d2f985daa21b // indirect
-golang.org/x/mod v0.29.0 // indirect
-golang.org/x/net v0.46.0 // indirect
+golang.org/x/crypto v0.42.0 // indirect
+golang.org/x/exp v0.0.0-20250911091902-df9299821621 // indirect
+golang.org/x/mod v0.28.0 // indirect
+golang.org/x/net v0.44.0 // indirect
 golang.org/x/sync v0.17.0 // indirect
-golang.org/x/sys v0.37.0 // indirect
-golang.org/x/telemetry v0.0.0-20251008203120-078029d740a8 // indirect
-golang.org/x/text v0.30.0 // indirect
+golang.org/x/sys v0.36.0 // indirect
+golang.org/x/telemetry v0.0.0-20250908211612-aef8a434d053 // indirect
+golang.org/x/text v0.29.0 // indirect
 golang.org/x/time v0.12.0 // indirect
-golang.org/x/tools v0.38.0 // indirect
+golang.org/x/tools v0.37.0 // indirect
 golang.org/x/xerrors v0.0.0-20240903120638-7835f813f4da // indirect
 gonum.org/v1/gonum v0.16.0 // indirect
 google.golang.org/genproto/googleapis/api v0.0.0-20250825161204-c5933d9347a5 // indirect
 google.golang.org/genproto/googleapis/rpc v0.0.0-20250825161204-c5933d9347a5 // indirect
 google.golang.org/grpc v1.75.0 // indirect
-google.golang.org/protobuf v1.36.10 // indirect
+google.golang.org/protobuf v1.36.9 // indirect
 gopkg.in/yaml.v3 v3.0.1 // indirect
 lukechampine.com/blake3 v1.4.1 // indirect
 )
