-
Notifications
You must be signed in to change notification settings - Fork 67
Open
Description
Form my logs it would seem that during the prefill refresh if the zone file download fails it sometimes causes the cache FS to fails for some unknown reason see logs
Aug 02 19:18:05 HOSTNAME kresd[584737]: [prefil] root zone file valid for 13 seconds, reusing data from disk
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME kresd[584747]: [prefil] downloading root zone to file root.zone ...
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr): SIGBUS received; this is most likely due to filling up the filesystem where cache resides.
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: captured stdio output from kresd1[584737] (stderr):
Aug 02 19:18:05 HOSTNAME supervisord[584725]: exited: kresd1 (exit status 121; not expected)
Aug 02 19:18:05 HOSTNAME aiohttp.server[584733]: Error handling request
Traceback (most recent call last):
File "/usr/lib64/python3.9/site-packages/aiohttp/web_protocol.py", line 452, in _handle_request
resp = await request_handler(request)
File "/usr/lib64/python3.9/site-packages/aiohttp/web_app.py", line 543, in _handle
resp = await handler(request)
File "/usr/lib64/python3.9/site-packages/aiohttp/web_middlewares.py", line 114, in impl
return await handler(request)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/server.py", line 62, in error_handler
return await handler(request)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/server.py", line 243, in _handler_metrics
body=await statistics.report_stats(),
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/statistics.py", line 338, in report_stats
await _resolver_collector.collect_kresd_stats()
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/statistics.py", line 96, in collect_kresd_stats
stats_raw = await command_registered_workers(cmd)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/registered_workers.py", line 34, in command_registered_workers
pairs = await asyncio.gather(*(single_pair(inst) for inst in _REGISTERED_WORKERS.values()))
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/registered_workers.py", line 32, in single_pair
return sub.id, await sub.command(cmd)
File "/usr/lib64/python3.9/site-packages/knot_resolver_manager/kresd_controller/interface.py", line 173, in command
reader, writer = await asyncio.open_unix_connection(f"./control/{int(self.id)}")
File "/usr/lib64/python3.9/asyncio/streams.py", line 111, in open_unix_connection
transport, _ = await loop.create_unix_connection(
File "/usr/lib64/python3.9/asyncio/unix_events.py", line 244, in create_unix_connection
await self.sock_connect(sock, path)
File "/usr/lib64/python3.9/asyncio/selector_events.py", line 500, in sock_connect
return await fut
File "/usr/lib64/python3.9/asyncio/selector_events.py", line 505, in _sock_connect
sock.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
Aug 02 19:18:05 HOSTNAME aiohttp.access[584733]: 10.157.2.182 [02/Aug/2024:19:18:05 +0000] "GET /metrics HTTP/1.1" 500 244 "-" "Prometheus/2.53.1"
Aug 02 19:18:06 HOSTNAME supervisord[584725]: spawned: 'kresd1' with pid 596546
Aug 02 19:18:06 HOSTNAME kresd[596546]: [system] path = /run/knot-resolver/control/1
Aug 02 19:18:06 HOSTNAME kresd[596546]: [rules ] cleared 1 stale reader locks
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] root zone file valid for 09 minutes 55 seconds, reusing data from disk
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] empty zone file
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] error parsing zone file `root.zone`
Aug 02 19:18:06 HOSTNAME kresd[584748]: [prefil] root zone import failed (/usr/lib64/knot-resolver/kres_modules/prefill.lua:97: [prefil] zone import failed: cdata<const char *>: 0x7f11ac7c532d), retry in 09 minutes 52 seconds
Aug 02 19:18:08 HOSTNAME kresd[584746]: [rules ] cleared 1 stale reader locks
Aug 02 19:18:08 HOSTNAME kresd[584738]: [prefil] downloading root zone to file root.zone ...
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [cache ] LMDB refusing writes (ignored for 5-9s): Invalid argument
Aug 02 19:18:08 HOSTNAME kresd[596546]: [system] error while loading config: kresd1.conf:263: can't open cache path '/var/cache/knot-resolver'; working directory '/run/knot-resolver'; No space left on device (workdir '/run/knot-resolver')
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [tasign] signalling query triggered: _ta-4f66.
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [system] assertion "env->is_cache" failed in cdb_write@../lib/cache/cdb_lmdb.c:689
Aug 02 19:18:08 HOSTNAME kresd[584753]: [tasign] signalling query triggered: _ta-4f66.
Aug 02 19:18:08 HOSTNAME supervisord[584725]: exited: kresd1 (exit status 1; not expected)
Please here is my current config
rundir: /run/knot-resolver
workers: 16
cache:
storage: /var/cache/knot-resolver
size-max: 20G
prefill:
- origin: "."
url: https://www.internic.net/domain/root.zone
refresh-interval: 6h
logging:
level: info
groups:
- system
- net
- dnssec
- resolv
- select
- exterr
- policy
network:
listen:
- interface: lo@53
kind: dns
- interface: ens192@53
kind: dns
- interface: ens192
kind: dot
management:
interface: 10.157.2.180@5000
# unix-socket: /run/knot-resolver/manager.sock
and the output of my file system structure
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 6.2G 696M 5.5G 12% /run
/dev/mapper/vg_oel-lv_root 246G 5.5G 228G 3% /
/dev/sda2 960M 418M 543M 44% /boot
/dev/sda1 1022M 6.3M 1016M 1% /boot/efi
/dev/mapper/vg_oel-lv_var 127G 29G 92G 24% /var
tmpfs 27G 21G 7.0G 75% /var/cache/knot-resolver
tmpfs 3.1G 0 3.1G 0% /run/user/1001
Metadata
Metadata
Assignees
Labels
No labels