Description
I'm running a GlusterFS distribute volume on two peers with 10 bricks. One of these bricks is to be removed, but the remove-brick operation repeatedly fails at a seemingly random point in time, after running for anywhere from 30 minutes up to 9 hours.
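For reference, the removal was kicked off with the usual start command and monitored via status:

$ gluster volume remove-brick tank node1:/mnt/md5/data start
$ gluster volume remove-brick tank node1:/mnt/md5/data status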
Status after failure:
$ gluster volume remove-brick tank node1:/mnt/md5/data status
     Node  Rebalanced-files     size  scanned  failures  skipped  status  run time in h:m:s
    -----  ----------------  -------  -------  --------  -------  ------  -----------------
    node1             11276    2.7GB   136050         8        0  failed            3:31:10

tank-rebalance.log on the peer running the remove-brick operation:
[2025-09-22 01:00:06.894942 +0000] E [rpc-clnt.c:167:call_bail] 0-tank-client-28: bailing out frame type(GlusterFS 4.x v1), op(ENTRYLK(31)), xid = 0x82f9d, unique = 1528357, sent = 2025-09-22 00:30:01 +0000, timeout = 1800 for 192.168.42.2:51408
[2025-09-22 01:00:06.895106 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:1418:client4_0_entrylk_cbk] 0-tank-client-28: remote operation failed. [{errno=107}, {error=Transport endpoint is not connected}]
[2025-09-22 01:00:06.924366 +0000] E [MSGID: 109023] [dht-rebalance.c:2872:gf_defrag_migrate_single_file] 0-tank-dht: migrate-data failed for /backups/cam/20210818/100828.jpg [Transport endpoint is not connected]
[2025-09-22 01:00:36.895197 +0000] E [rpc-clnt.c:167:call_bail] 0-tank-client-28: bailing out frame type(GlusterFS 4.x v1), op(ENTRYLK(31)), xid = 0x8330e, unique = 1533932, sent = 2025-09-22 00:30:32 +0000, timeout = 1800 for 192.168.42.2:51408
[2025-09-22 01:00:36.895247 +0000] E [MSGID: 114031] [client-rpc-fops_v2.c:1418:client4_0_entrylk_cbk] 0-tank-client-28: remote operation failed. [{errno=107}, {error=Transport endpoint is not connected}]
[2025-09-22 01:00:36.906739 +0000] E [MSGID: 109023] [dht-rebalance.c:2872:gf_defrag_migrate_single_file] 0-tank-dht: migrate-data failed for /backups/cam/20210818/163125.jpg [Transport endpoint is not connected]
From the bailing out frame message, I gather that an operation was not answered by a brick within 1800 seconds (which matches network.frame-timeout, still at its default of 1800 in the settings below). I took a Wireshark dump of the network traffic between the two nodes in which the original operation is visible. No reply is recorded, however, so the rebalance process bails out 1800 seconds after the request was sent. Note that Wireshark appends the reply packet ID to the description of each call; for the unanswered call no reply is found. The network connection itself seems fine: there are no dropped packets, and the request shows up in packet captures on both peers.
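For anyone retracing this: RPC calls and replies share an XID, so the lost call can be isolated with Wireshark's standard ONC-RPC fields (the XID below is taken from the first bail-out message; 51408 is the port of the affected brick):

# all RPC traffic to/from the affected brick
$ tshark -r dump.pcap -Y 'rpc && tcp.port == 51408'
# the specific exchange; a call (rpc.msgtyp == 0) without a matching
# reply (rpc.msgtyp == 1) for the same XID is the unanswered one
$ tshark -r dump.pcap -Y 'rpc.xid == 0x82f9d'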
I ran another remove-brick operation with diagnostics.brick-log-level set to DEBUG, but no suspicious message appears at the time the operation is lost. I'm not entirely sure the snippet below is from the right ENTRYLK call, since ~5 calls are received within the second given by the rebalance log timestamp, but the most interesting spot is one where three calls are scheduled at once:
[2025-09-22 00:30:01.601104 +0000] D [MSGID: 0] [server-rpc-fops_v2.c:2638:server4_inodelk_resume] 0-/mnt/md0/data: frame 0x60947771b538, xlator 0x609476b77618
[2025-09-22 00:30:01.601104 +0000] D [MSGID: 0] [io-threads.c:365:iot_schedule] 0-tank-io-threads: ENTRYLK scheduled as least priority fop
[2025-09-22 00:30:01.601194 +0000] D [MSGID: 0] [io-threads.c:365:iot_schedule] 0-tank-io-threads: SETATTR scheduled as normal priority fop
[2025-09-22 00:30:01.601199 +0000] D [MSGID: 0] [io-threads.c:365:iot_schedule] 0-tank-io-threads: INODELK scheduled as least priority fop
[2025-09-22 00:30:01.601272 +0000] D [MSGID: 0] [posix-metadata.c:127:posix_fetch_mdata_xattr] 0-tank-posix: No such attribute:trusted.glusterfs.mdata for file /mnt/md0/data/.glusterfs/13/c7/13c7ec74-5fa9-4ac4-bf97-948410ff7225 gfid: 13c7ec74-5fa9-4ac4-bf97-948410ff7225
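What strikes me in this snippet is that ENTRYLK is scheduled as a least-priority fop while performance.least-prio-threads sits at its default of 1 (see the settings dump below). If that single least-priority queue ever stalls, a lock request could wait indefinitely. Purely as a guess on my part, an experiment I could run next:

$ gluster volume set tank performance.least-prio-threads 4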
Some observations I made across six attempts:
- it happens on any brick on the remote node, never on a brick local to the node running the remove-brick operation
- the unanswered call is always an ENTRYLK operation
- once, multiple ENTRYLK operations issued within a 3-second time frame remained unanswered, each reported with its own bailing out frame message
Any suggestions as to what might be the cause, or how to debug this further?
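I can also take a statedump of the affected brick process while a call is pending; as far as I understand, granted and blocked locks show up in the dump written under server.statedump-path (/var/run/gluster here), so it should tell whether the ENTRYLK is blocked or never reached the locks translator. Sketch (the dump file name contains the brick path and PID):

$ gluster volume statedump tank
$ grep -i -A4 entrylk /var/run/gluster/mnt-md0-data.*.dump.*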
Full volume settings
Option                                      Value
------                                      -----
auth.allow *
auth.reject (null) (DEFAULT)
auth.ssl-allow *
changelog.capture-del-path off (DEFAULT)
changelog.changelog-barrier-timeout 120
changelog.changelog-dir {{ brick.path }}/.glusterfs/changelogs (DEFAULT)
changelog.changelog off (DEFAULT)
changelog.encoding ascii (DEFAULT)
changelog.fsync-interval 5 (DEFAULT)
changelog.rollover-time 15 (DEFAULT)
client.bind-insecure (null) (DEFAULT)
client.event-threads 8
client.keepalive-count 9
client.keepalive-interval 2
client.keepalive-time 20
client.send-gids on (DEFAULT)
client.ssl off
client.strict-locks off
client.tcp-user-timeout 0
cluster.background-self-heal-count 8 (DEFAULT)
cluster.brick-graceful-cleanup disable
cluster.brick-multiplex disable
cluster.choose-local true (DEFAULT)
cluster.consistent-metadata no (DEFAULT)
cluster.daemon-log-level INFO
cluster.data-change-log on (DEFAULT)
cluster.data-self-heal-algorithm (null) (DEFAULT)
cluster.data-self-heal off (DEFAULT)
cluster.dht-xattr-name trusted.glusterfs.dht (DEFAULT)
cluster.disperse-self-heal-daemon enable (DEFAULT)
cluster.eager-lock on (DEFAULT)
cluster.enable-shared-storage disable
cluster.ensure-durability on (DEFAULT)
cluster.entry-change-log on (DEFAULT)
cluster.entry-self-heal off (DEFAULT)
cluster.extra-hash-regex (null) (DEFAULT)
cluster.favorite-child-policy none (DEFAULT)
cluster.force-migration off
cluster.full-lock yes (DEFAULT)
cluster.granular-entry-heal no (DEFAULT)
cluster.halo-enabled False (DEFAULT)
cluster.halo-max-latency 5 (DEFAULT)
cluster.halo-max-replicas 99999 (DEFAULT)
cluster.halo-min-replicas 2 (DEFAULT)
cluster.halo-nfsd-max-latency 5 (DEFAULT)
cluster.halo-shd-max-latency 99999 (DEFAULT)
cluster.heal-timeout 600 (DEFAULT)
cluster.heal-timeout 600 (DEFAULT)
cluster.heal-wait-queue-length 128 (DEFAULT)
cluster.local-volume-name (null) (DEFAULT)
cluster.locking-scheme full (DEFAULT)
cluster.lock-migration off
cluster.lookup-optimize on (DEFAULT)
cluster.lookup-unhashed on (DEFAULT)
cluster.max-bricks-per-process 250
cluster.metadata-change-log on (DEFAULT)
cluster.metadata-self-heal off (DEFAULT)
cluster.min-free-disk 1%
cluster.min-free-inodes 5% (DEFAULT)
cluster.optimistic-change-log on (DEFAULT)
cluster.post-op-delay-secs 1 (DEFAULT)
cluster.quorum-count (null) (DEFAULT)
cluster.quorum-reads no (DEFAULT)
cluster.quorum-type none (DEFAULT)
cluster.randomize-hash-range-by-gfid off (DEFAULT)
cluster.readdir-optimize on
cluster.read-hash-mode 1 (DEFAULT)
cluster.read-subvolume-index -1 (DEFAULT)
cluster.read-subvolume (null) (DEFAULT)
cluster.rebalance-stats off (DEFAULT)
cluster.rebal-throttle normal
cluster.rmdir-optimize on (DEFAULT)
cluster.rsync-hash-regex (null) (DEFAULT)
cluster.self-heal-daemon on (DEFAULT)
cluster.self-heal-readdir-size 1KB (DEFAULT)
cluster.self-heal-window-size 8 (DEFAULT)
cluster.server-quorum-ratio 51
cluster.server-quorum-type off
cluster.shd-max-threads 1 (DEFAULT)
cluster.shd-wait-qlength 1024 (DEFAULT)
cluster.subvols-per-directory (null) (DEFAULT)
cluster.switch-pattern (null) (DEFAULT)
cluster.use-anonymous-inode yes
cluster.use-compound-fops off
cluster.weighted-rebalance on (DEFAULT)
config.brick-threads 16
config.client-threads 16
config.gfproxyd off
config.global-threading off
ctime.noatime on
debug.delay-gen off
debug.error-failure (null) (DEFAULT)
debug.error-fops (null) (DEFAULT)
debug.error-gen off
debug.error-number (null) (DEFAULT)
debug.exclude-ops (null) (DEFAULT)
debug.include-ops (null) (DEFAULT)
debug.log-file no (DEFAULT)
debug.log-history no (DEFAULT)
debug.random-failure off (DEFAULT)
debug.trace off
delay-gen.delay-duration 100000 (DEFAULT)
delay-gen.delay-percentage 10% (DEFAULT)
delay-gen.enable (DEFAULT)
dht.force-readdirp on (DEFAULT)
diagnostics.brick-log-buf-size 5 (DEFAULT)
diagnostics.brick-log-flush-timeout 120 (DEFAULT)
diagnostics.brick-log-format (null) (DEFAULT)
diagnostics.brick-logger (null) (DEFAULT)
diagnostics.brick-log-level DEBUG
diagnostics.brick-sys-log-level CRITICAL (DEFAULT)
diagnostics.client-log-buf-size 5 (DEFAULT)
diagnostics.client-log-flush-timeout 120 (DEFAULT)
diagnostics.client-log-format (null) (DEFAULT)
diagnostics.client-logger (null) (DEFAULT)
diagnostics.client-log-level ERROR
diagnostics.client-sys-log-level CRITICAL (DEFAULT)
diagnostics.count-fop-hits off
diagnostics.dump-fd-stats off (DEFAULT)
diagnostics.fop-sample-buf-size 65535 (DEFAULT)
diagnostics.fop-sample-interval 0 (DEFAULT)
diagnostics.latency-measurement off
diagnostics.stats-dnscache-ttl-sec 86400 (DEFAULT)
diagnostics.stats-dump-format json (DEFAULT)
diagnostics.stats-dump-interval 0 (DEFAULT)
disperse.background-heals 8 (DEFAULT)
disperse.cpu-extensions auto (DEFAULT)
disperse.eager-lock on (DEFAULT)
disperse.eager-lock-timeout 1 (DEFAULT)
disperse.heal-wait-qlength 128 (DEFAULT)
disperse.optimistic-change-log on (DEFAULT)
disperse.other-eager-lock on (DEFAULT)
disperse.other-eager-lock-timeout 1 (DEFAULT)
disperse.parallel-writes on (DEFAULT)
disperse.quorum-count 0 (DEFAULT)
disperse.read-policy gfid-hash (DEFAULT)
disperse.self-heal-window-size 32 (DEFAULT)
disperse.shd-max-threads 1 (DEFAULT)
disperse.shd-wait-qlength 1024 (DEFAULT)
disperse.stripe-cache 4 (DEFAULT)
features.acl enable
features.alert-time 86400 (DEFAULT)
features.auto-commit-period 180 (DEFAULT)
features.barrier disable
features.barrier-timeout 120
features.bitrot disable
features.cache-invalidation on
features.cache-invalidation-timeout 1800
features.cloudsync off
features.cloudsync-product-id (null) (DEFAULT)
features.cloudsync-remote-read off
features.cloudsync-store-id (null) (DEFAULT)
features.cloudsync-storetype (null) (DEFAULT)
features.ctime on
features.ctime on (DEFAULT)
features.default-retention-period 120 (DEFAULT)
features.default-soft-limit 80% (DEFAULT)
features.enforce-mandatory-lock off
features.expiry-time 120
features.failover-hosts (null) (DEFAULT)
features.hard-timeout 5 (DEFAULT)
feature.simple-quota-pass-through true
feature.simple-quota.use-backend false
features.inode-quota off
features.lease-lock-recall-timeout 60 (DEFAULT)
features.leases off
features.locks-monkey-unlocking false (DEFAULT)
features.locks-notify-contention-delay 5 (DEFAULT)
features.locks-notify-contention yes (DEFAULT)
features.locks-revocation-clear-all false (DEFAULT)
features.locks-revocation-max-blocked 0 (DEFAULT)
features.locks-revocation-secs 0 (DEFAULT)
features.quota-deem-statfs off
features.quota off
features.read-only off (DEFAULT)
features.retention-mode relax (DEFAULT)
features.scrub false (DEFAULT)
features.scrub-freq biweekly
features.scrub-throttle lazy
features.sdfs off
features.selinux on
features.shard-block-size 64MB (DEFAULT)
features.shard-deletion-rate 100 (DEFAULT)
features.shard-lru-limit 16384 (DEFAULT)
features.shard off
features.show-snapshot-directory off
features.signer-threads 4
features.snapshot-directory .snaps
features.soft-timeout 60 (DEFAULT)
features.tag-namespaces off
features.timeout 45 (DEFAULT)
features.trash-dir .trashcan (DEFAULT)
features.trash-eliminate-path (null) (DEFAULT)
features.trash-internal-op off (DEFAULT)
features.trash-max-filesize 5MB (DEFAULT)
features.trash off (DEFAULT)
features.uss off
features.worm-file-level off
features.worm-files-deletable on
features.worm off
ganesha.enable off
geo-replication.ignore-pid-check off
geo-replication.ignore-pid-check off
geo-replication.indexing off
geo-replication.indexing off
glusterd.vol_count_per_thread 100
locks.mandatory-locking off (DEFAULT)
locks.trace off (DEFAULT)
network.compression.compression-level 1 (DEFAULT)
network.compression.debug false (DEFAULT)
network.compression.mem-level 8 (DEFAULT)
network.compression.min-size 1024 (DEFAULT)
network.compression off
network.compression.window-size -15 (DEFAULT)
network.frame-timeout 1800 (DEFAULT)
network.inode-lru-limit 1000000
network.ping-timeout 42 (DEFAULT)
network.remote-dio disable (DEFAULT)
network.tcp-window-size (null) (DEFAULT)
network.tcp-window-size (null) (DEFAULT)
nfs.acl on (DEFAULT)
nfs.addr-namelookup off (DEFAULT)
nfs.auth-cache-ttl-sec (null) (DEFAULT)
nfs.auth-refresh-interval-sec (null) (DEFAULT)
nfs.disable on
nfs.drc off (DEFAULT)
nfs.drc-size 0x20000 (DEFAULT)
nfs.dynamic-volumes off (DEFAULT)
nfs.enable-ino32 no (DEFAULT)
nfs.event-threads 2 (DEFAULT)
nfs.export-dir (DEFAULT)
nfs.export-dirs on (DEFAULT)
nfs.exports-auth-enable (null) (DEFAULT)
nfs.export-volumes on (DEFAULT)
nfs.mem-factor 15 (DEFAULT)
nfs.mount-rmtab /var/lib/glusterd/nfs/rmtab (DEFAULT)
nfs.mount-udp off (DEFAULT)
nfs.nlm on (DEFAULT)
nfs.outstanding-rpc-limit 16 (DEFAULT)
nfs.port 2049 (DEFAULT)
nfs.ports-insecure off (DEFAULT)
nfs.rdirplus on (DEFAULT)
nfs.readdir-size (1 * 1048576ULL) (DEFAULT)
nfs.read-size (1 * 1048576ULL) (DEFAULT)
nfs.register-with-portmap on (DEFAULT)
nfs.rpc-auth-allow all (DEFAULT)
nfs.rpc-auth-null on (DEFAULT)
nfs.rpc-auth-reject none (DEFAULT)
nfs.rpc-auth-unix on (DEFAULT)
nfs.rpc-statd /sbin/rpc.statd (DEFAULT)
nfs.server-aux-gids off (DEFAULT)
nfs.trusted-sync off (DEFAULT)
nfs.trusted-write off (DEFAULT)
nfs.volume-access read-write (DEFAULT)
nfs.write-size (1 * 1048576ULL) (DEFAULT)
performance.aggregate-size 128KB (DEFAULT)
performance.cache-capability-xattrs true (DEFAULT)
performance.cache-ima-xattrs true (DEFAULT)
performance.cache-invalidation on
performance.cache-max-file-size 0 (DEFAULT)
performance.cache-min-file-size 0 (DEFAULT)
performance.cache-priority (DEFAULT)
performance.cache-refresh-timeout 1 (DEFAULT)
performance.cache-samba-metadata false (DEFAULT)
performance.cache-size 1GB
performance.cache-size 1GB
performance.cache-swift-metadata false (DEFAULT)
performance.client-io-threads on
performance.ctime-invalidation false (DEFAULT)
performance.enable-least-priority on (DEFAULT)
performance.flush-behind on (DEFAULT)
performance.force-readdirp true (DEFAULT)
performance.global-cache-invalidation true (DEFAULT)
performance.high-prio-threads 16 (DEFAULT)
performance.io-cache on
performance.io-cache-pass-through false (DEFAULT)
performance.io-cache-size 32MB (DEFAULT)
performance.iot-cleanup-disconnected-reqs off (DEFAULT)
performance.io-thread-count 16
performance.iot-pass-through false (DEFAULT)
performance.iot-watchdog-secs (null) (DEFAULT)
performance.lazy-open yes (DEFAULT)
performance.least-prio-threads 1 (DEFAULT)
performance.low-prio-threads 16 (DEFAULT)
performance.md-cache-pass-through false (DEFAULT)
performance.md-cache-statfs off (DEFAULT)
performance.md-cache-timeout 600
performance.nfs.flush-behind on (DEFAULT)
performance.nfs.io-cache off
performance.nfs.io-threads off
performance.nfs.quick-read off
performance.nfs.read-ahead off
performance.nfs.stat-prefetch off
performance.nfs.strict-o-direct off (DEFAULT)
performance.nfs.strict-write-ordering off (DEFAULT)
performance.nfs.write-behind on
performance.nfs.write-behind-trickling-writes on (DEFAULT)
performance.nfs.write-behind-window-size 1MB (DEFAULT)
performance.nl-cache-limit 10MB
performance.nl-cache on
performance.nl-cache-pass-through false (DEFAULT)
performance.nl-cache-positive-entry on
performance.nl-cache-timeout 1800
performance.normal-prio-threads 16 (DEFAULT)
performance.open-behind on
performance.open-behind-pass-through false (DEFAULT)
performance.parallel-readdir on
performance.qr-cache-timeout 1 (DEFAULT)
performance.quick-read-cache-invalidation false (DEFAULT)
performance.quick-read-cache-size 128MB (DEFAULT)
performance.quick-read-cache-timeout 1 (DEFAULT)
performance.quick-read on
performance.rda-cache-limit 512MB
performance.rda-high-wmark 128KB (DEFAULT)
performance.rda-low-wmark 4096 (DEFAULT)
performance.rda-request-size 131072
performance.read-after-open yes (DEFAULT)
performance.read-ahead off
performance.read-ahead-page-count 4 (DEFAULT)
performance.read-ahead-pass-through false (DEFAULT)
performance.readdir-ahead on
performance.readdir-ahead-pass-through false (DEFAULT)
performance.resync-failed-syncs-after-fsync off (DEFAULT)
performance.stat-prefetch on
performance.strict-o-direct off (DEFAULT)
performance.strict-write-ordering off (DEFAULT)
performance.write-behind on
performance.write-behind-pass-through false (DEFAULT)
performance.write-behind-trickling-writes on (DEFAULT)
performance.write-behind-window-size 1MB (DEFAULT)
performance.xattr-cache-list (DEFAULT)
rebalance.ensure-durability on (DEFAULT)
server.allow-insecure on (DEFAULT)
server.all-squash off (DEFAULT)
server.anongid 65534 (DEFAULT)
server.anonuid 65534 (DEFAULT)
server.dynamic-auth on (DEFAULT)
server.event-threads 8
server.gid-timeout 300 (DEFAULT)
server.keepalive-count 9
server.keepalive-interval 2
server.keepalive-time 20
server.manage-gids off (DEFAULT)
server.outstanding-rpc-limit 64 (DEFAULT)
server.own-thread (null) (DEFAULT)
server.root-squash off (DEFAULT)
server.ssl off
server.statedump-path /var/run/gluster (DEFAULT)
server.tcp-user-timeout 42 (DEFAULT)
ssl.ca-list (null) (DEFAULT)
ssl.certificate-depth (null) (DEFAULT)
ssl.cipher-list (null) (DEFAULT)
ssl.crl-path (null) (DEFAULT)
ssl.dh-param (null) (DEFAULT)
ssl.ec-curve (null) (DEFAULT)
ssl.own-cert (null) (DEFAULT)
ssl.private-key (null) (DEFAULT)
storage.batch-fsync-delay-usec 0 (DEFAULT)
storage.batch-fsync-mode reverse-fsync (DEFAULT)
storage.build-pgfid off (DEFAULT)
storage.create-directory-mask 0777 (DEFAULT)
storage.create-mask 0777 (DEFAULT)
storage.fips-mode-rchecksum off (DEFAULT)
storage.force-create-mode 0000 (DEFAULT)
storage.force-directory-mode 0000 (DEFAULT)
storage.gfid2path on
storage.gfid2path-separator : (DEFAULT)
storage.health-check-interval 600
storage.health-check-timeout 120
storage.linux-aio off (DEFAULT)
storage.linux-io_uring off (DEFAULT)
storage.max-hardlinks 100 (DEFAULT)
storage.node-uuid-pathinfo off (DEFAULT)
storage.owner-gid -1 (DEFAULT)
storage.owner-uid -1 (DEFAULT)
storage.reserve 3
transport.address-family inet
transport.keepalive 1
transport.listen-backlog 1024
$ gluster volume status
Status of volume: tank
Gluster process TCP Port RDMA Port Online Pid
------------------------------------------------------------------------------
Brick node1:/mnt/md1/data 53879 0 Y 2278940
Brick node1:/mnt/md0/data 50861 0 Y 2278978
Brick node2:/mnt/md1/data 53946 0 Y 1363542
Brick node1:/mnt/md5/data 51931 0 Y 2279047
Brick node2:/mnt/md0/data 51408 0 Y 1363580
Brick node1:/mnt/md6/data 57994 0 Y 2279087
Brick node2:/mnt/md3/data 57386 0 Y 1363620
Brick node2:/mnt/md5/data 56964 0 Y 1363658
Brick node2:/mnt/md2/data 56353 0 Y 1363696
Brick node1:/mnt/md2/data 58383 0 Y 2279132
Brick node1:/mnt/md3/data 60034 0 Y 2279186
Brick node2:/mnt/md4/data 52472 0 Y 1363734
Task Status of Volume tank
------------------------------------------------------------------------------
Task : Remove brick
ID : 879e440d-ecd9-4054-9259-cfad4f5b3d51
Removed bricks:
node1:/mnt/md5/data
Status : failed