Description of problem:
Mount the GlusterFS fuse client on all three nodes, then run the following loop simultaneously on all three mounts:

```sh
while true; do
    cp -r ~/testdir12 /mnt/gluster/testvol/testdir11/
    rm -rf /mnt/gluster/testvol/testdir11/testdir12
done
```
/mnt/gluster/testvol/ is the mountpoint
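The mounts are ordinary fuse mounts, e.g. (the server hostname here is an assumption; any of the three nodes can serve the volume):

```sh
mount -t glusterfs node0001:/testvol /mnt/gluster/testvol
```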
Accessing the directory then fails with a "Stale file handle" error.
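A simple way to catch the moment of failure (a hypothetical helper, not part of the original report) is to poll the directory from another shell until the error appears. A plain "No such file or directory" is expected while the `rm -rf` runs, so the loop matches only on the stale handle:

```sh
MNT=/mnt/gluster/testvol
while :; do
    # capture stderr only; discard the normal listing on stdout
    err=$(ls "$MNT/testdir11/testdir12" 2>&1 >/dev/null)
    case $err in
        *'Stale file handle'*) echo "$err"; break ;;
    esac
done
```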
The exact command to reproduce the issue:
The full output of the command that failed:
Expected results:
Mandatory info:

- The output of the `gluster volume info` command:

```
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: e3f2e610-891e-4cdd-a7a7-be0eaa5a4b5e
Status: Started
Snapshot Count: 0
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: node0001:/gluster/brick-0/testvol2
Brick2: node0002:/gluster/brick-6/testvol2
Brick3: node0001:/gluster/brick-1/testvol2
Brick4: node0002:/gluster/brick-7/testvol2
Brick5: node0001:/gluster/brick-2/testvol2
Brick6: node0003:/gluster/brick-3/testvol2
Brick7: node0002:/gluster/brick-8/testvol2
Brick8: node0003:/gluster/brick-4/testvol2
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
storage.fips-mode-rchecksum: on
cluster.granular-entry-heal: on
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
diagnostics.client-log-level: TRACE
cluster.server-quorum-ratio: 51%
cluster.enable-shared-storage: enable
cluster.daemon-log-level: INFO
```
- The output of the `gluster volume status` command:

```
Status of volume: testvol
Gluster process TCP Port RDMA Port Online Pid
Brick node0001:/gluster/brick-0/testvol2 49152 0 Y 486039
Brick node0002:/gluster/brick-6/testvol2 49152 0 Y 334762
Brick node0001:/gluster/brick-1/testvol2 49158 0 Y 424917
Brick node0002:/gluster/brick-7/testvol2 49154 0 Y 337339
Brick node0001:/gluster/brick-2/testvol2 49159 0 Y 424924
Brick node0003:/gluster/brick-3/testvol2 49153 0 Y 117508
Brick node0002:/gluster/brick-8/testvol2 49155 0 Y 337355
Brick node0003:/gluster/brick-4/testvol2 49155 0 Y 117926
Self-heal Daemon on localhost N/A N/A Y 474409
Quota Daemon on localhost N/A N/A Y 476337
Self-heal Daemon on efsnode0002 N/A N/A Y 861635
Quota Daemon on efsnode0002 N/A N/A Y 913346
Self-heal Daemon on efsnode0003 N/A N/A Y 391839
Quota Daemon on efsnode0003 N/A N/A Y 394644
Task Status of Volume testvol
There are no active volume tasks
```

- The output of the `gluster volume heal` command:

```
Brick node0001:/gluster/brick-0/testvol2
Status: Connected
Number of entries: 0
Brick node0002:/gluster/brick-6/testvol2
Status: Connected
Number of entries: 0
Brick node0001:/gluster/brick-1/testvol2
Status: Connected
Number of entries: 0
Brick node0002:/gluster/brick-7/testvol2
Status: Connected
Number of entries: 0
Brick node0001:/gluster/brick-2/testvol2
Status: Connected
Number of entries: 0
Brick node0003:/gluster/brick-3/testvol2
Status: Connected
Number of entries: 0
Brick node0002:/gluster/brick-8/testvol2
Status: Connected
Number of entries: 0
Brick node0003:/gluster/brick-4/testvol2
Status: Connected
Number of entries: 0
```
- Provide logs present in the following locations on client and server nodes:
/var/log/glusterfs/
- Is there any crash? Provide the backtrace and coredump
Additional info:
We found that when creating a directory, GlusterFS first creates it on the hashed subvolume and then on the other subvolumes; when deleting a directory, it first deletes the copies on the other subvolumes and then the copy on the hashed subvolume. In both cases the directory lock is taken only on the hashed subvolume, not on all subvolumes. When the same directory is created and deleted concurrently, this leads to a gfid split-brain: while a delete is in flight, some copies on non-hashed subvolumes have already been removed, and a concurrent create can re-create the directory (with a new gfid) on the subvolumes where the old copy is already gone, while copies carrying the old gfid still survive elsewhere. The result is one directory name with different gfids on different subvolumes, since the gfid of the newly created directory differs from that of the deleted one.
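The mismatch can be confirmed directly on the bricks by comparing the trusted.gfid xattr of the suspect directory. A minimal sketch, run on each server node; the brick glob and directory path below are assumptions based on the volume layout above:

```sh
#!/bin/bash
# Print the gfid of one directory on every local brick of testvol.
# BRICKS and DIR are assumptions; adjust them per node.
BRICKS=(/gluster/brick-*/testvol2)
DIR="testdir11/testdir12"
for brick in "${BRICKS[@]}"; do
    [ -d "$brick/$DIR" ] || continue
    printf '%s/%s  ' "$brick" "$DIR"
    getfattr --absolute-names -n trusted.gfid -e hex "$brick/$DIR" \
        | awk -F= '/trusted.gfid/ { print $2 }'
done
# Differing gfids for the same directory across bricks confirm the
# gfid split-brain.
```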
- The operating system / glusterfs version:
glusterfs 6.0
Note: Please hide any confidential data which you don't want to share in public like IP address, file name, hostname or any other configuration
Should GlusterFS take locks on all subvolumes when performing directory operations, or only the inodelk on the hashed subvolume? We find that the above exception occurs when only the hashed-subvolume lock is taken during the directory operation. @amarts @avati @raghavendrabhat
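The interleaving we believe is happening can be reproduced in miniature with plain bash and flock(1). This is a toy model, not GlusterFS code: three local directories stand in for subvolumes, a marker file stands in for the trusted.gfid xattr, and, as described above, both writers lock only the "hashed" subvolume:

```sh
#!/bin/bash
# Toy model of the race: sv0 plays the hashed subvolume, sv1 and sv2
# the other subvolumes; the "gfid" file stands in for trusted.gfid.
HASHED=sv0
mkdir -p sv0 sv1 sv2

create_dir() {   # hashed copy first, under the hashed-subvolume lock only
    local gfid=$1
    flock "$HASHED" -c "mkdir $HASHED/d 2>/dev/null && echo $gfid > $HASHED/d/gfid"
    for sv in sv1 sv2; do   # non-hashed copies are created without a lock
        mkdir "$sv/d" 2>/dev/null && echo "$gfid" > "$sv/d/gfid"
    done
}

delete_dir() {   # non-hashed copies are removed first, hashed copy last
    rm -rf sv1/d sv2/d
    flock "$HASHED" -c "rm -rf $HASHED/d"
}

worker() {       # one worker per "client", like the cp/rm loop above
    for i in $(seq 300); do
        create_dir "$BASHPID-$i"
        delete_dir
    done
}

worker & w1=$!
worker & worker &
while kill -0 "$w1" 2>/dev/null; do
    # Two distinct markers visible at the same instant mean a lookup
    # would see one name with two gfids: the split-brain analogue.
    if [ "$(cat sv*/d/gfid 2>/dev/null | sort -u | wc -l)" -gt 1 ]; then
        echo "gfid mismatch observed:"
        grep . sv*/d/gfid 2>/dev/null
        break
    fi
done
wait
```

In the model, wrapping create_dir and delete_dir entirely in one lock spanning all three stand-in subvolumes makes the observer's check never fire, which matches our suspicion that hashed-only locking is the gap.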